CANADIAN JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. X- NO. 4 
1958 
The chords of the non-ruled quadric in PG(3, 3) W.T. Tutte 481 
The chords of the non-ruled quadric in PG(3, 3) H.S.M.Coxeter 484 
An S-configuration in Euclidean and 
elliptic n-space Sahib Ram Mandan 489 
On the medians of a triangle in hyperbolic geometry O.Bottema 502 
Correction to “Transitivities in projective planes” T. G. Ostrom 507
On the number of dissimilar graphs between a given 
graph-subgraph pair Frank Harary 513 
Coverings of bipartite graphs 
A. L. Dulmage and N.S. Mendelsohn 517 
Oriented flat submanifolds in an affine space M. J. Englefield 535 
On the lattice of topologies Juris Hartmanis 547 
Exemple effectif d'un ensemble transfiniment 
non-projectif F. Rothberger 554 
Congruence representations in algebraic 
number fields Eckford Cohen 561 
Well distributed sequences 
F. R. Keogh, B. Lawton and G. M. Petersen 572 
Further identities and congruences for the 
coefficients of modular forms Morris Newman 577 
On bounded matrices with non-negative elements C. R. Putnam 587
Orthogonal polynomials and hypergeometric series 
A. van der Sluis 592 
On the inversion of the Gauss transformation, II P.G. Rooney 613 
Generalized integrals with respect to functions of 
bounded variation R. L. Jeffery 617 
On the Denjoy conjecture James A. Jenkins 627 
On Babinet’s principle R.C. MacCamy 632 


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, G. F. D. Duff, R. D. James, R. L. Jeffery, 
J.-M. Maranda, G. de B. Robinson, H. Zassenhaus
with the co-operation of 


A. D. Alexandrov, R. Brauer, W. P. Brown, D. B. DeLury, J. Dixmier, 
P. Hall, I. Halperin, P. Scherk, J. L. Synge, A. W. Tucker, 
W. J. Webber, M. Wyman 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, G. F. D. Duff, University of Toronto. Everything 
possible should be done to lighten the task of the reader; the notation 
and reference system should be carefully thought out. Every paper 
should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $8.00. This is reduced to $4.00 for individual members of 
recognized Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton College 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary’s University University of Toronto 


National Research Council of Canada 
and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 











THE CHORDS OF THE NON-RULED QUADRIC 
IN PG(3, 3) 


W. T. TUTTE 


The 8-cage (3) may be defined as the simplest cubic graph having no
circuit of fewer than eight edges. To construct it we first observe that it
must contain a tree T whose vertices are of degrees 1 and 3 and in which
each vertex of degree 1 is separated by an arc of just three edges from a
central edge AB. These properties fix the structure of T uniquely. Starting
with T we join the vertices of degree 1 by new edges so as to form a cubic
graph, taking care to introduce no circuit of fewer than eight edges. It is
found by trial that this can be done in essentially only one way. The resulting
graph is the 8-cage. It has 30 vertices and 45 edges.

The 8-cage is 5-regular, that is, if S and S’ are any two oriented arcs of
length 5 in it there is a unique automorphism of the 8-cage transforming S
into S’. Since there are $30 \cdot 3 \cdot 2^4 = 1440$ such arcs, the automorphism group of the
8-cage is of order 1440. It is shown in (3) that if in any s-regular graph the
length of the shortest circuit is m, then $s \le \tfrac{1}{2}m + 1$. Thus if s = 5 then
$m \ge 8$. Accordingly the 8-cage is the simplest 5-regular cubic graph. It is
also shown that there is no s-regular cubic graph such that s > 5. The 8-cage
has therefore been called “the most regular of all graphs” (1).

In (1) the 8-cage is exhibited as the Levi graph of the Cremona-Richmond 
configuration. The object of the present note is to describe another geometrical 
occurrence of the graph. 

Let P denote the finite 3-dimensional projective space PG(3, 3) whose
40 points have homogeneous co-ordinates x, y, z, t over the field of residues
mod 3. Let Q denote the quadric $x^2 + y^2 + z^2 - t^2 = 0$ in P. The points of
Q are the 4 points for which t = 0 while the remaining co-ordinates are
non-zero, together with the 6 points for which two of the first three co-ordinates
are zero and the other two co-ordinates are non-zero: 10 in all. It is easily
verified that each plane of P is on at least one of these 10 points and that
each tangent plane of Q is on just one of them. Thus Q has no generators. An
account of the geometry of the “ellipsoid” Q is given in (2); here we are
concerned only with the relation of this geometry to the 8-cage.
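The point counts just quoted can be checked by brute force. The following Python sketch assumes the quadric is $x^2 + y^2 + z^2 - t^2 = 0$ over the residues mod 3 (the form consistent with the ten points described); the function and variable names are illustrative only and do not come from the paper.

```python
from itertools import product

def projective_points(n):
    """Points of PG(n-1, 3): nonzero vectors whose first nonzero entry is 1."""
    pts = []
    for v in product(range(3), repeat=n):
        nonzero = [c for c in v if c]
        if nonzero and nonzero[0] == 1:
            pts.append(v)
    return pts

def on_quadric(p):
    # Assumed equation of Q: x^2 + y^2 + z^2 - t^2 = 0 (mod 3).
    x, y, z, t = p
    return (x * x + y * y + z * z - t * t) % 3 == 0

space = projective_points(4)
quadric = [p for p in space if on_quadric(p)]

def plane_section(plane):
    """Points of Q lying on the plane with the given coefficient vector."""
    return [p for p in quadric
            if sum(a * c for a, c in zip(plane, p)) % 3 == 0]

# Planes of PG(3, 3) are also indexed by normalised coefficient vectors.
section_sizes = sorted({len(plane_section(plane)) for plane in space})
print(len(space), len(quadric), section_sizes)
```

Under these assumptions every plane meets Q in either one point (a tangent plane) or four points (a conic), so no plane contains a line of Q.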

Let V be the set of the 30 points of P not on Q, and let E be the set of the 
45 lines of P meeting Q in two distinct points. We can regard V and E as 
sets of vertices and edges respectively of a graph G, the intersections of the 
edges in points of Q being regarded as irrelevant to the graph structure. We 
proceed to show that G is an 8-cage. 


Received September 24, 1957. 













If X ∈ V the polar plane of X meets Q in the four points of a proper conic.
The lines of E through X meet the other six points of Q in pairs. Hence G is
cubic. It is now only necessary to show that G has no circuit of fewer than
eight edges.

Let $v_0, v_1, \ldots, v_n = v_0$ be the vertices of any circuit of G, taken in their
natural cyclic order. We have $n \ge 4$, since no six points of Q are coplanar.
We write $E_i$ for the edge joining $v_i$ and $v_{i+1}$ and $F_i$ for the edge incident with
$v_i$ but not with $v_{i-1}$ or $v_{i+1}$. (Addition in the suffixes is mod n.) Since no
five points of Q are coplanar we have the rule that no two of the five edges of
G incident with $v_i$ or $v_{i+1}$ have a common point on Q. In other words, each
point of Q is on just one of these five edges.

Suppose $E_0$ is on the points A and B of Q. Then by the rule just stated
neither A nor B is on any other edge of G incident with $v_1$ or $v_2$, but each is
on some edge of G incident with $v_2$ or $v_3$. Hence each of A and B is on one
of the lines $E_3$ and $F_3$. Since neither of these lines is $E_0$ we may suppose $E_3$
is on A and $F_3$ is on B. Applying the same argument, but going round the
circuit in the other direction, we find that each of A and B is on one of the
lines $E_{n-3}$ and $F_{n-2}$. (Each of these is incident with $v_{n-2}$.) These results have
the following consequences:


(i) If n = 4 then A is on two edges of G incident with $v_2$ or $v_3$.
(ii) If n = 5 then A is on two edges of G incident with $v_3$ or $v_4$.
(iii) If n = 6 then B is on two edges of G incident with $v_3$ or $v_4$.
(iv) If n = 7 then A is on two edges of G incident with $v_4$ or $v_5$.


Applying our rule we deduce that $n \ge 8$. Thus each circuit of G has 8 or
more edges.
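The conclusion of this argument can be confirmed numerically. The sketch below is only a brute-force check, not the method of the paper: it assumes the quadric $x^2 + y^2 + z^2 - t^2 = 0$ over GF(3), builds the graph whose edges are the chords, and measures its girth by breadth-first search. All names are hypothetical.

```python
from collections import deque
from itertools import combinations, product

def normalise(v):
    """Reduce mod 3 and scale so that the first nonzero entry is 1."""
    v = [c % 3 for c in v]
    lead = next(c for c in v if c)
    return tuple((lead * c) % 3 for c in v)  # 1 and 2 are self-inverse mod 3

quadric = [p for p in product(range(3), repeat=4)
           if any(p) and normalise(p) == p
           and (p[0] ** 2 + p[1] ** 2 + p[2] ** 2 - p[3] ** 2) % 3 == 0]

# The chord through a and b carries two further points, a+b and a-b;
# these are the two vertices joined by the corresponding edge of G.
adjacency = {}
for a, b in combinations(quadric, 2):
    u = normalise([x + y for x, y in zip(a, b)])
    v = normalise([x - y for x, y in zip(a, b)])
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

def girth(adj):
    """Length of a shortest circuit, found by BFS from every vertex."""
    best = None
    for root in adj:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y], parent[y] = dist[x] + 1, x
                    queue.append(y)
                elif parent[x] != y:
                    cycle = dist[x] + dist[y] + 1
                    best = cycle if best is None else min(best, cycle)
    return best

num_edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2
degrees = {len(nbrs) for nbrs in adjacency.values()}
print(len(adjacency), num_edges, degrees, girth(adjacency))
```

The search reports 30 vertices, 45 edges, every vertex of degree 3, and girth 8, in agreement with the text.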

Now let the points of Q be denoted by the letters a to j according to the
following scheme:


a (1, 0, 0, 1),   b (1, 0, 0, 2),   c (0, 1, 0, 1),   d (0, 1, 0, 2),
e (0, 0, 1, 1),   f (0, 0, 1, 2),   g (1, 1, 1, 0),   h (1, 1, 2, 0),
i (1, 2, 1, 0),   j (1, 2, 2, 0).


Postmultiplication of the co-ordinate vectors by the matrices

$$
\begin{pmatrix} 2&0&1&0\\ 2&0&2&0\\ 0&0&0&1\\ 0&1&0&0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 2&0&1&0\\ 1&0&1&0\\ 0&0&0&1\\ 0&1&0&0 \end{pmatrix}
$$

effects the permutations R = (ajecifdg)(bh) and L = (ajbhecgfdi) of the points
of Q. These induce automorphisms r and l of G.
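The action of the two matrices on the ten points can be checked mechanically. The Python sketch below takes the co-ordinates of a to j to be those consistent with the arc S used later (in particular i = (1, 2, 1, 0)), and takes the first matrix to have rows 2010, 2020, 0001, 0100, which is the reading compatible with the stated permutation R; both readings are assumptions, and the names are illustrative.

```python
POINTS = {
    "a": (1, 0, 0, 1), "b": (1, 0, 0, 2), "c": (0, 1, 0, 1), "d": (0, 1, 0, 2),
    "e": (0, 0, 1, 1), "f": (0, 0, 1, 2), "g": (1, 1, 1, 0), "h": (1, 1, 2, 0),
    "i": (1, 2, 1, 0), "j": (1, 2, 2, 0),
}
R_MATRIX = [(2, 0, 1, 0), (2, 0, 2, 0), (0, 0, 0, 1), (0, 1, 0, 0)]
L_MATRIX = [(2, 0, 1, 0), (1, 0, 1, 0), (0, 0, 0, 1), (0, 1, 0, 0)]
NAMES = {coords: letter for letter, coords in POINTS.items()}

def normalise(v):
    """Reduce mod 3 and scale so that the first nonzero entry is 1."""
    v = [c % 3 for c in v]
    lead = next(c for c in v if c)
    return tuple((lead * c) % 3 for c in v)

def image(vec, matrix):
    """Row vector times matrix over GF(3), returned as a normalised point."""
    columns = zip(*matrix)
    return normalise([sum(x * m for x, m in zip(vec, col)) for col in columns])

def cycle_notation(matrix):
    """Permutation of the letters a..j induced by the matrix, in cycle form."""
    perm = {letter: NAMES[image(p, matrix)] for letter, p in POINTS.items()}
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        cycles.append("(" + "".join(cycle) + ")")
    return "".join(cycles)

print(cycle_notation(R_MATRIX))  # (ajecifdg)(bh)
print(cycle_notation(L_MATRIX))  # (ajbhecgfdi)
```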

There is an oriented arc S of G of length 5 determined by the sequence
(0, 0, 1, 0) (0, 0, 0, 1) (0, 1, 0, 0) (1, 0, 1, 0) (1, 0, 2, 2) (1, 1, 2, 1) of vertices
of G. Its edges, taken in the corresponding order, are ef, cd, gi, af, dj. The
automorphisms r and l convert S into oriented arcs given by the edge-sequences
(cd, gi, af, dj, eg) and (cd, gi, af, dj, bi), respectively. Using the theory of
(3, § 3), we deduce that r and l generate the whole group of automorphisms
of G. Hence each automorphism of G is induced by some automorphism of Q,
that is, some projective transformation of P under which Q is invariant. On
the other hand, it is clear that at most one permutation of the letters a to j
can correspond to a given automorphism of G. It follows that the
automorphism groups of G and Q are isomorphic.

Inspection of a diagram of G shows that when the edges through a specified 
point of Q are removed, the graph falls into two connected parts, each of 
these being a Thomsen graph with its edges once subdivided. 


REFERENCES 


1. H. S. M. Coxeter, Self-dual configurations and regular graphs, Bull. Amer. Math. Soc., 
56 (1950), 413-455. 

2. W. L. Edge, Geometry in three dimensions over GF (3), Proc. Roy. Soc. A222 (1953), 262-286. 

3. W. T. Tutte, A family of cubical graphs, Proc. Cambridge Philos. Soc., 43 (1947), 459-474. 


University of Toronto 











THE CHORDS OF THE NON-RULED QUADRIC 
IN PG(3, 3) 


H. S. M. COXETER 


1. Introduction. In the preceding paper, Tutte described the forty-five
chords of an “ellipsoid” in the finite space PG(3, 3), showing that they may
be regarded as the edges of a remarkable graph whose group is the group of
automorphisms of the symmetric group $\mathfrak{S}_6$. The object of this sequel is to
relate Tutte’s idea to Sylvester’s combinatorial investigation of the fifteen
“duads” and fifteen “synthemes” formed by six symbols, and to Richmond's
discussion of the figure of six points in a projective 4-space. (The syntheme
12,34,56 is made up of the three duads 12, 34, 56.)
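As a small illustration of the combinatorial side, the duads and synthemes of six symbols can be enumerated directly, joining each syntheme to the three duads it contains. The Python sketch below is an independent check and not part of either paper; the names are hypothetical.

```python
from collections import deque
from itertools import combinations

# A duad is a pair of the six symbols; a syntheme is a set of three
# disjoint duads that together use all six symbols.
duads = [frozenset(d) for d in combinations(range(1, 7), 2)]
synthemes = [frozenset(t) for t in combinations(duads, 3)
             if len(frozenset().union(*t)) == 6]

adjacency = {v: set() for v in duads + synthemes}
for s in synthemes:
    for d in s:
        adjacency[s].add(d)
        adjacency[d].add(s)

def girth(adj):
    """Length of a shortest circuit, found by BFS from every vertex."""
    best = None
    for root in adj:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y], parent[y] = dist[x] + 1, x
                    queue.append(y)
                elif parent[x] != y:
                    cycle = dist[x] + dist[y] + 1
                    best = cycle if best is None else min(best, cycle)
    return best

num_edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2
degrees = {len(nbrs) for nbrs in adjacency.values()}
print(len(duads), len(synthemes), num_edges, degrees, girth(adjacency))
```

The resulting bipartite graph has 15 + 15 vertices, 45 edges, degree 3 throughout and no circuit shorter than 8, which are the defining parameters of the 8-cage.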


2. The relevant work of Edge. Edge (4, pp. 271, 275) observed that the
non-ruled quadric in PG(3, 3) consists of ten points, and that the remaining
thirty points in the space fall into two sets of fifteen, called “positive” and
“negative,” which are in (3, 3) correspondence: each positive point is joined
by chords to three negative points, and likewise for the other sign. Tutte (7)
observed that consequently the thirty “exterior” points and forty-five chords
form a graph of degree three, which he identified with his 8-cage (3, p. 442).¹

The present note achieves this identification more quickly by a notational
device. Since the thirty vertices of the 8-cage (Figure 1)² correspond to the
duads and synthemes of six digits (6, p. 92), we merely have to associate the
same symbols with the thirty points of PG(3, 3) not lying on the quadric.
This was actually done by Edge (4, p. 275) when he found that each of the
fifteen negative points is the sole common vertex of two of the six “negative
pentagons” (complete pentagons whose ten edges are tangents), and that
the fifteen positive points “answer one to each of the fifteen synthemes of
negative pentagons.” However, a more direct procedure is to replace Edge’s
six pentagons in PG(3, 3) by six points in PG(4, 3).


3. Richmond's contribution. The background for this idea was supplied
by Richmond (5, pp. 127-129), who observed that the duads and synthemes
provide an ideal notation for the “edges” and “transversals” of the hexastigm


Received February 6, 1958.


¹ I take this opportunity to correct an error on page 441: In the last two sentences of §8, the
word “equilateral” occurs three times. The second and third occurrences should be deleted.

² For simplicity, only the “duad” vertices have been marked. The synthemes can be
immediately inferred; for example, the vertex that appears between 14 and 23 (near the top)
is (14,23,56).


[FIGURE 1: diagram of the 8-cage, with the “duad” vertices labelled]


formed by six points of general position in any projective 4-space. The six
vertices 1, ..., 6 of the hexastigm are joined in sets of two, three, or four by:

15 edges such as the line 12,
20 faces such as the plane 123,
15 spaces such as the hyperplane 1234.

Each edge meets the opposite space in a diagonal point such as

$$P_{12} = 12 \cdot 3456.$$

Three edges which together involve all the six vertices are met by a unique
line called a transversal; for example, the line

$$P_{12} P_{34} P_{56} = 3456 \cdot 5612 \cdot 1234$$

is the transversal of the edges 12, 34, 56. The (3, 3) correspondence between
edges and transversals (or between duads and synthemes) is seen in the fact
that each transversal meets three edges while each edge belongs to three
transversals. Since the three transversals that involve the edge 12 all meet
it in the same point $P_{12}$, the fifteen diagonal points and the fifteen transversals
form a configuration $15_3$ of the kind used by Baker in two of his frontispieces
(1; 2).

There are also fifteen harmonic points such as $Q_{12}$, the harmonic conjugate
of $P_{12}$ with respect to 1 and 2, and ten points of “minor importance” (5, p.
128) such as

$$P_{123} = P_{456} = 123 \cdot 456,$$

the intersection of two opposite faces. The forty-five joins of these last ten
points pass by threes through the fifteen harmonic points; for example, the
lines

$$P_{134} P_{234}, \quad P_{135} P_{235}, \quad P_{136} P_{236}$$

all pass through $Q_{12}$. Other triads of these forty-five lines meet the
transversals; for example, the lines

$$P_{134} P_{234}, \quad P_{123} P_{124}, \quad P_{125} P_{345}$$

all meet the transversal $P_{12} P_{34} P_{56}$.


To check these results we may use Möbius’s barycentric calculus (1, p. 97;
2, p. 115), which allows us to “weight” the six vertices in such a way that

$$1 + 2 + 3 + 4 + 5 + 6 = 0.$$

Then

$$P_{12} = 1 + 2, \quad P_{34} = 3 + 4, \quad P_{56} = 5 + 6;$$
$$Q_{12} = 1 - 2; \quad P_{123} = 1 + 2 + 3, \quad P_{134} - P_{234} = 1 - 2 = Q_{12};$$
$$P_{134} + P_{234} = P_{12} + 2P_{34}, \quad P_{123} + P_{124} = 2P_{12} + P_{34}, \quad P_{125} - P_{345} = P_{12} - P_{34}.$$


4. Six points in PG(4, 3). We shall find that, when the co-ordinate field
is restricted to GF(3), three new features present themselves:


I. The ten “minor” points $P_{ijk}$ all lie in a hyperplane.
II. They form a non-ruled quadric in this 3-space.
III. The three chords that meet a transversal all meet it in the same point.


Using co-ordinates $(x_1, x_2, x_3, x_4, x_5)$ (mod 3), we take the vertices 1, ...,
5, 6 to be

$$(1, 0, 0, 0, 0), \ \ldots, \ (0, 0, 0, 0, 1), \ (1, 1, 1, 1, 1).$$

Then the spaces 3456 and 1234 are

$$x_1 = x_2 \quad\text{and}\quad x_5 = 0,$$

the diagonal points $P_{12}$ and $P_{56}$ are

$$(1, 1, 0, 0, 0) \quad\text{and}\quad (1, 1, 1, 1, 0),$$

the transversal $P_{12} P_{34} P_{56}$ is

$$x_1 = x_2, \quad x_3 = x_4 \quad\text{and}\quad x_5 = 0.$$


Proof of I. The point 123·456 (denoted by $P_{123}$ or $P_{456}$) is

$$(1, 1, 1, 0, 0).$$

Permuting the five co-ordinates, we obtain the ten “minor” points, all lying
in the hyperplane

$$x_1 + x_2 + x_3 + x_4 + x_5 = 0 \pmod 3.$$


Proof of II. These ten are the only points (in the 3-space) that lie on the
quadric

$$x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 = 0,$$

which is non-ruled since its section by the tangent plane $x_1 + x_2 + x_3 = 0$
consists of the point of contact (1, 1, 1, 0, 0) alone. In fact, our equation
agrees with Edge’s “ellipsoid”

$$x^2 + y^2 + z^2 + yz + zx + xy + xt + yt + zt + t^2 = 0$$

(4, p. 274).
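Statements I and II lend themselves to a direct enumeration. The Python sketch below assumes the hyperplane is $\sum x_i = 0$ and the quadric is $\sum x_i^2 = 0$ over GF(3), as the proofs indicate; the names are illustrative and the code is a check, not the argument of the paper.

```python
from itertools import permutations, product

def normalise(v):
    """Reduce mod 3 and scale so that the first nonzero entry is 1."""
    v = [c % 3 for c in v]
    lead = next(c for c in v if c)
    return tuple((lead * c) % 3 for c in v)

# All points of PG(4, 3), one normalised representative each.
space = [p for p in product(range(3), repeat=5)
         if any(p) and normalise(p) == p]

# The 3-space sum(x) = 0 and the quadric sum(x^2) = 0 inside it.
hyperplane = [p for p in space if sum(p) % 3 == 0]
quadric = {p for p in hyperplane if sum(c * c for c in p) % 3 == 0}

# The ten "minor" points: co-ordinate permutations of (1, 1, 1, 0, 0).
minor = set(permutations((1, 1, 1, 0, 0)))

# Section of the quadric by the plane x1 + x2 + x3 = 0 of the 3-space.
tangent_section = [p for p in quadric if (p[0] + p[1] + p[2]) % 3 == 0]

print(len(space), len(hyperplane), quadric == minor, tangent_section)
```

Under these assumptions the hyperplane has 40 points, exactly the ten minor points satisfy the quadratic equation, and the stated tangent plane meets the quadric only at (1, 1, 1, 0, 0).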


Proof of III. Since the chords lie in the 3-space while the diagonal points
are outside, those chords which meet any one transversal must pass through
the point in which that transversal meets the 3-space. An alternative proof
is provided by the observation that, since 2 = −1, the three symbols

$$P_{12} + 2P_{34}, \quad 2P_{12} + P_{34}, \quad P_{12} - P_{34}$$

(as well as $P_{34} - P_{56}$, $P_{56} - P_{12}$, etc.) all refer to the same point. Since this
is the point of intersection of the 3-space $\sum x = 0$ and the transversal
$P_{12} P_{34} P_{56}$, it is naturally denoted by the syntheme (12,34,56).


5. Conclusion. In this geometry every line contains just four points. In
particular, the remaining two points on the chord $P_{134} P_{234}$ are

$$P_{134} - P_{234} = Q_{12} \quad\text{and}\quad P_{134} + P_{234} = (12,34,56).$$

The thirty points of PG(3, 3) not on the non-ruled quadric are thus denoted
by the fifteen duads such as 12 (meaning $Q_{12}$) and the fifteen synthemes
such as (12,34,56). Each of the forty-five chords joins one “duad” point to
one “syntheme” point, and the duad belongs to the syntheme. In other words,
as Tutte observed, if we disregard the ten points that lie on the quadric itself,
the forty-five chords are the edges of the 8-cage.
















Edge’s designation of the two sets of fifteen points as negative and positive
can be justified by examining the co-ordinates. The “duad” points, such as

$$Q_{12} = (1, 2, 0, 0, 0), \quad Q_{56} = (1, 1, 1, 1, 2),$$

are “negative” since $\sum x^2 = -1$; the “syntheme” points, such as

$$(12, 34, 56) = (1, 1, 2, 2, 0),$$

are “positive” since $\sum x^2 = 1$.
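The sign rule, and the fact that every chord joins a negative point to a positive one, can be verified by enumeration. The Python sketch below assumes the ten quadric points are the co-ordinate permutations of (1, 1, 1, 0, 0) and classifies the two remaining points of each chord by $\sum x^2$ mod 3; the names are hypothetical.

```python
from itertools import combinations, permutations

def normalise(v):
    """Reduce mod 3 and scale so that the first nonzero entry is 1."""
    v = [c % 3 for c in v]
    lead = next(c for c in v if c)
    return tuple((lead * c) % 3 for c in v)

def square_sum(p):
    """Sum of squares mod 3: 1 marks a positive point, 2 (= -1) a negative one."""
    return sum(c * c for c in p) % 3

minor = sorted(set(permutations((1, 1, 1, 0, 0))))

negative, positive, chords = set(), set(), []
for a, b in combinations(minor, 2):
    # The chord ab has four points: a, b, a-b and a+b.
    ends = [normalise([x - y for x, y in zip(a, b)]),
            normalise([x + y for x, y in zip(a, b)])]
    ends.sort(key=square_sum, reverse=True)   # negative end first
    chords.append(tuple(ends))
    negative.add(ends[0])
    positive.add(ends[1])

mixed = all(square_sum(n) == 2 and square_sum(p) == 1 for n, p in chords)
print(len(chords), len(negative), len(positive), mixed)
```

The check finds 45 chords, 15 negative and 15 positive points, with one end of each sign on every chord; the sample points (1, 2, 0, 0, 0) and (1, 1, 2, 2, 0) fall in the negative and positive sets respectively.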


REFERENCES 


1. H. F. Baker, Principles of geometry, vol. 2 (Cambridge, 1930).

2. H. F. Baker, Principles of geometry, vol. 4 (Cambridge, 1940).

3. H. S. M. Coxeter, Self-dual configurations and regular graphs, Bull. Amer. Math. Soc., 56
(1950), 413-455.

4. W. L. Edge, Geometry in three dimensions over GF(3), Proc. Royal Soc., A222 (1954),
262-286.

5. H. W. Richmond, On the figure of six points in space of four dimensions, Quart. J. Math.,
31 (1900), 125-160.

6. J. J. Sylvester, Elementary researches in the analysis of combinatorial aggregation, Collected
Mathematical Papers, vol. 1 (Cambridge, 1904).

7. W. T. Tutte, The chords of the non-ruled quadric in PG(3, 3), Can. J. Math., 10 (1958),
481-483.


University of Toronto 











AN S-CONFIGURATION IN EUCLIDEAN AND 
ELLIPTIC n-SPACE 


SAHIB RAM MANDAN 


Introduction. “The remarkable analogies which exist between the complete
quadrilateral and the desmic system of points suggest that it may be
possible to extend the properties considered above to spaces of higher
dimensions”, remarks Prof. N. A. Court at the end of his paper (2). Here is an
attempt in that direction in Euclidean as well as in elliptic 4-space, suggesting 
extensions in higher spaces. The corresponding figure, called an S-configuration, 
is discussed. Its vertices lie in pairs on the edges of a simplex, separated har- 
monically by the respective pairs of vertices of the simplex, called its diagonal 
simplex in analogy with the diagonal triangle of a quadrilateral in a plane 
and a diagonal tetrahedron of a desmic system in a solid. The vertices of the 
dual of an S-configuration form a closed set of $2^n$ points w.r.t. their diagonal
simplex such that all quadrics for which the simplex is selfpolar, passing 
through one of them pass through all of them, and each vertex is the harmonic 
inverse of every other w.r.t. a pair of opposite elements of the simplex. The 
S-configuration reduces to a cross polytope and its dual to a hypercube reci- 
procal to this polytope, when a cell of the diagonal simplex recedes to infinity 
as a selfpolar simplex for the absolute polarity, while the remaining vertex 
of the simplex (opposite this cell), is the common centre of the polytopes. 
This is analogous to a desmic system and its conjugate reducing respectively 
to an octahedron and a cube, when a face of the diagonal tetrahedron recedes 
to infinity as a selfpolar triangle for the absolute polarity while the remaining 
vertex of the tetrahedron (opposite this face) is the common centre of the 
polyhedra. The midpoints of the segments determined by the pairs of opposite 
vertices of an S-configuration lie in a hyperplane, called its Newtonian hyper- 
plane. 

The centres of similitude of a set of hyperspheres, with centres at the 
vertices of a simplex, taken in pairs, form an S-configuration with the given 
simplex as its diagonal simplex. 


1. SPACE OF FOUR DIMENSIONS 
1. Construction. 


(a) Let $Q_{\lambda\mu}$ ($\lambda, \mu = A, B, C, D, E$; $\lambda\mu = \mu\lambda$, $\lambda \ne \mu$) be the traces of a given
solid on the ten edges $\lambda\mu$ of a given simplex (S) = ABCDE, and $P_{\lambda\mu}$ the
harmonic conjugates of $Q_{\lambda\mu}$ for the corresponding pairs of vertices of (S).


Received February 5, 1958. 













The twenty points $P_{\lambda\mu}$, $Q_{\lambda\mu}$ are said to form an S-configuration (S-C) with
$P_{\lambda\mu}$, $Q_{\lambda\mu}$ as ten pairs of its opposite vertices and (S) as its diagonal simplex
in analogy with the diagonal triangle of a complete quadrilateral in a plane
and a diagonal tetrahedron of a desmic system.

(b) $k_{\lambda\mu\nu} = Q_{\mu\nu} Q_{\nu\lambda} Q_{\lambda\mu}$ ($\nu = A, B, C, D, E$; $\lambda, \mu \ne \nu$) is a triad of collinear
points on the line of intersection of the given solid with a plane face $\lambda\mu\nu$ of
(S) and then $l_{\lambda\mu\nu} = Q_{\mu\nu} P_{\nu\lambda} P_{\lambda\mu}$, $m_{\lambda\mu\nu} = P_{\mu\nu} Q_{\nu\lambda} P_{\lambda\mu}$, $n_{\lambda\mu\nu} = P_{\mu\nu} P_{\nu\lambda} Q_{\lambda\mu}$ form three
other triads of collinear points so that we have a complete quadrilateral
(2) for which $\lambda\mu\nu$ is the diagonal triangle. Thus there are four triads of collinear
points in each plane face of (S) and 4 × 10 = 40 in all, such that (40 × 3)/20
= 6 of their forty lines pass through each point, two in each of the three
plane faces of (S) through its edge on which lies the point considered.

(c) The ten plane faces of (S) form the diagonal triangles of the ten quadri- 
laterals determined by the ten tetrads of lines enumerated above. There are 
forty more quadrilaterals determined by these forty lines. Eight of them lie
in each cell of (S) that together with the four in the four plane faces of the
cell form a $12_6$ configuration of Poncelet and Reye (3, p. 473): twelve points
lying by sixes in twelve planes, six planes through each point. They are
enumerated in the following five octads of planes.


Pe = RryRprokvrRoru, Por = RyyMyavemMsorlory, 
Gq = Lyre vorxNoru, jw = LyrRuvel orxMory, 
To = MyyyMyvEM yenMeayryg, y = My pv] pvoR vox Ory, 
te = Myolyrolvorlorg, ter = MpyrMyveM Rory, 


where 06, @ = A, B,C, D, E:\, nu, v ¥O¥ 6. 

It may be observed here that four of these forty planes pass through each 
of the forty lines (§1(b)) and twelve through each vertex of the (S-C). 

(d) The ten points $Q_{\lambda\mu}$ (§1(a)), the ten lines $k_{\lambda\mu\nu}$ (§1(b)) and the five planes
$p_\theta$ (§1(c)) form a Desargues’ $10_3$ configuration as the intersection of the given
solid with the different elements of the simplex (S). There are sixteen such solids
containing such configurations, one each. The following list shows that a
pair of them pass through each of the forty planes (§1(c)):


So = PaPsPcPope, Sy = PpateloPoPe; 
$1 = Gata? cQp’ Te, Sv = Gada'de'dv de; 
S2 = Talal cl Dre, Sex = TaPaTcl pT rR; 
$3 = lLadabctptez, Sy = lartgtetpte, 

Ss = da'Pe Po Pope’, Sv = QatatcPo' Pe’, 
Ss = PalteTo'dn' We’; Ss) = Pa'UaIcUv Ve’; 
So = taf pT cl pre’, Se = ta Pade pT e’, 
Sy = Ta Qe Pctp te’, Sy, = Palplteortyle. 


It is readily seen that four of these solids pass through each of the forty lines 
(§1(b)) and eight through each vertex of the (S-C). 










(e) Thus the S-configuration is 
20(., 6, 12, 8)40(3, ., 4, 4)40(6, 4, ., 2)16(10, 10, 5, .) 


in Baker’s notation for configurations (1). 


2. The dual of an S-configuration. 
(a) The dual (R.S-C) of an S-configuration is thus 
16(., 5, 10, 10)40(2, ., 4, 6)40(4, 4, ., 3)20(8, 12, 6, .),


that is, it constitutes sixteen points, forty lines, forty planes, and twenty 
solids such that: 

Through each point pass five lines, ten planes, and ten solids. 

Each line contains two points, and four planes and six solids pass through 
it. 

Each plane contains four points and four lines, and three solids pass through 
it. 

Each solid contains eight points, twelve lines, and six planes. 
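The two configuration symbols can be cross-checked by double counting: if there are $N_i$ elements of dimension $i$, each incident with $a_{ij}$ elements of dimension $j$, then $N_i a_{ij} = N_j a_{ji}$. The Python sketch below encodes both symbols as tables (a representation chosen here for illustration) and also confirms that the second is the first read in reverse order.

```python
# Rows: points, lines, planes, solids; None marks the diagonal dot.
S_COUNTS = [20, 40, 40, 16]
S_TABLE = [[None, 6, 12, 8],
           [3, None, 4, 4],
           [6, 4, None, 2],
           [10, 10, 5, None]]

DUAL_COUNTS = [16, 40, 40, 20]
DUAL_TABLE = [[None, 5, 10, 10],
              [2, None, 4, 6],
              [4, 4, None, 3],
              [8, 12, 6, None]]

def consistent(counts, table):
    """Check N_i * a_ij == N_j * a_ji for every pair of dimensions."""
    n = len(counts)
    return all(counts[i] * table[i][j] == counts[j] * table[j][i]
               for i in range(n) for j in range(i + 1, n))

def dualise(counts, table):
    """Exchange points with solids and lines with planes."""
    return counts[::-1], [row[::-1] for row in table[::-1]]

print(consistent(S_COUNTS, S_TABLE), consistent(DUAL_COUNTS, DUAL_TABLE))
print(dualise(S_COUNTS, S_TABLE) == (DUAL_COUNTS, DUAL_TABLE))
```

Both symbols pass the double-counting test, and dualising the symbol of the S-configuration reproduces the symbol of its dual entry by entry.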

(b) In fact, we start with a point, say U(1, 1, 1, 1, 1), the unit point
referred to the given simplex (S), and join it to the ten plane faces of (S)
giving the ten solids $u_i = u_j$ ($i \ne j$; $i, j = a, b, c, d, e$), where $u_i = 0$, $u_j = 0$
represent the five solid faces of (S). These ten solids together with their ten
harmonic conjugates $u_i = -u_j$ w.r.t. $u_i = 0$ and $u_j = 0$, two through each
plane face of (S) separated harmonically by two of its solid faces through
this plane face, constitute the twenty solids of the (R.S-C) for which (S) is
said to be the diagonal simplex.

The sixteen points or rather the sixteen vertices of the (R.S-C) are then no
other than (±1, ±1, ±1, ±1, 1) referred to (S), which form a closed set
(7) w.r.t. it in the sense that all quadrics, for which (S) is selfpolar, passing
through one of them pass through all of them, and every vertex of this
configuration is an harmonic inverse of another w.r.t. a pair of opposite elements
of (S).

Thus: The vertices of the dual of an S-configuration form a closed set of sixteen 
points w.r.t. their diagonal simplex. 

(c) Let the secant from U to the edge DE and the opposite plane face ABC
of (S) meet them in $P_{DE}$ and $P'_{DE}$ respectively. $P_{DE}$ is then (0, 0, 0, 1, 1) and
$P'_{DE}$ is (1, 1, 1, 0, 0) referred to (S). Let U’ be such that $(U P_{DE} U' P'_{DE}) = -1$.
U’ is then (−1, −1, −1, 1, 1). The harmonic conjugate $Q_{DE}$(0, 0, 0, −1, 1)
of $P_{DE}$ w.r.t. D, E lies in the polar (9) solid of U, viz. $s_0 = \sum u_i = 0$
w.r.t. (S). Thus we can construct points like $P_{DE}$ on the remaining nine edges
of (S) other than DE and their harmonic conjugates w.r.t. the respective
pairs of its vertices, and identify these twenty points with the ten pairs of
opposite vertices of the (S-C) (§1(a)). Thus: The sixteen solids of an
S-configuration (S-C) are the sixteen polar solids of the sixteen vertices of its dual
(R.S-C) w.r.t. their diagonal simplex such that through every vertex of the (S-C)
pass the joins of four pairs of vertices of the (R.S-C) lying in one of its twenty
solids corresponding to the vertex of the (S-C) considered. For example, through
$P_{DE}$ pass the joins of four pairs of vertices (±1, ±1, ±1, 1, 1), UU’ being one,
lying in the solid $ABCP_{DE}$.
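The example can be checked with integer co-ordinates. The Python sketch below takes the sixteen vertices to be (±1, ±1, ±1, ±1, 1) and the point on DE to be (0, 0, 0, 1, 1), as read from the text, and looks for pairs of vertices whose join passes through that point; it also tests the closed-set property on one sample diagonal quadric. The names and the sample coefficients are illustrative assumptions.

```python
from itertools import combinations, product

vertices = [signs + (1,) for signs in product((1, -1), repeat=4)]
P_DE = (0, 0, 0, 1, 1)

def det3(m):
    """Determinant of a 3 x 3 integer matrix."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def collinear(p, q, r):
    """Three projective points are collinear when every 3 x 3 minor vanishes."""
    rows = (p, q, r)
    return all(det3([[row[k] for k in cols] for row in rows]) == 0
               for cols in combinations(range(5), 3))

pairs = [(u, v) for u, v in combinations(vertices, 2) if collinear(u, v, P_DE)]

# A quadric sum(a_i * x_i^2) = 0 has the simplex as a selfpolar simplex;
# these sample coefficients sum to zero, so it passes through U.
coefficients = (1, 2, -3, 4, -4)
on_quadric = [sum(a * x * x for a, x in zip(coefficients, v)) == 0
              for v in vertices]

print(len(pairs), all(on_quadric))
for u, v in pairs:
    print(u, v)
```

Exactly four pairs are found, all with fourth and fifth co-ordinates equal (that is, in the solid through A, B, C and the point on DE), and the sample quadric through U contains all sixteen vertices.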


3. A cross polytope and hypercube. 


(a) The six pairs of points $P_{\lambda\mu}$, $Q_{\lambda\mu}$ ($\lambda, \mu \ne E$) (§1(a)) form a desmic system
(2) of three tetrahedra such that any two of them are quadruply perspective
from the vertices of the third, in the solid face ABCD of the simplex (S).
Let the two tetrads of planes $\xi_E$ and $\xi'_E$ ($\xi = p, q, r, t$) (§1(c)) form respectively
two tetrahedra $T'_E$ and $T''_E$. They are readily seen to be the other two diagonal
tetrahedra (10) of the desmic system considered besides the one $T_E = ABCD$
such that $T_E$, $T'_E$, $T''_E$ form the conjugate system (6).

Let the plane ABC recede to infinity, in which case the four lines $y_{ABC}$
($y = k, l, m, n$) (§1(b)) and the six points $P_{\lambda\mu}$, $Q_{\lambda\mu}$ ($\lambda, \mu = A, B, C$), lying in
the same plane, likewise recede to infinity, thus leaving the three pairs of
points $P_{D\lambda}$, $Q_{D\lambda}$ respectively on the three edges DA, DB, DC of $T_E$ with D
as the common midpoint of their segments.

Let ABC be so chosen that it forms a selfpolar triangle for the circle at
infinity. DA, DB, DC then form a rectangular system of axes. Let further the
three points $P_{D\lambda}$ be equidistant from D. $P_{D\lambda}$, $Q_{D\lambda}$ then form the three pairs of
opposite vertices of an octahedron with $\xi_E$, $\xi'_E$ as the four pairs of its parallel
opposite triangular faces. $T'_E$, $T''_E$ form a stella octangula (3, p. 378) whose
vertices then form a cube reciprocal to this octahedron.

Thus: A desmic system of points and its conjugate one reduce to an octahedron 
and a cube respectively, when a face of its diagonal tetrahedron recedes to infinity 
as a selfpolar triangle for the circle at infinity there, with centre at the vertex of 
the tetrahedron opposite this face. 

(b) Now let the solid ABCD recede to infinity, in which case the eight planes
$\xi_E$, $\xi'_E$ (§3(a)), the sixteen lines $y_{\lambda\mu\nu}$ ($\lambda, \mu, \nu \ne E$) (§1(b)) and the twelve
points $P_{\lambda\mu}$, $Q_{\lambda\mu}$ (§3(a)), lying in the same solid, likewise recede to infinity,
thus leaving the four pairs of points $P_{E\lambda}$, $Q_{E\lambda}$ respectively on the four edges
EA, EB, EC, ED of (S) with E as the common midpoint of their segments.
The dual (R.S-C) of the S-configuration (§2(c)) becomes a parallelotope
(5, p. 122) with E as its centre, and $P_{E\lambda}$, $Q_{E\lambda}$ become the centres of the eight
parallelepiped faces of it.

Let ABCD be so chosen that it forms a selfpolar tetrahedron for the absolute
polarity. EA, EB, EC, ED then form a rectangular system of axes. The (R.S-C)
is then an orthotope (5, p. 123). Further, let the four points $P_{E\lambda}$ be equidistant
from E. $P_{E\lambda}$, $Q_{E\lambda}$ then form the four pairs of opposite vertices of a cross
polytope $\beta_4$ (3, p. 376) with $s_i$, $s'_i$ ($i = 0, \ldots, 7$) (§1(d)) as the eight pairs
of its parallel opposite tetrahedral faces, $\xi_\theta$, $\xi'_\theta$ as its sixteen pairs of parallel
opposite triangular faces and $y_{E\lambda\mu}$ as its twenty-four edges. The (R.S-C) is
now a hypercube $\gamma_4$ (5, p. 123) reciprocal to this polytope.





— = - * = 





Q 


yf 


ow oO 


QR @ ff of 





AN S-CONFIGURATION 493 


Thus: The S-configuration reduces to a cross polytope and its dual to a hyper- 
cube reciprocal to this polytope, when a solid face of their diagonal simplex 
recedes to infinity as a selfpolar tetrahedron for the absolute polarity, with centre 
at the vertex of the simplex opposite this face. 

(c) A hypercube has eight pairs of opposite vertices, sixteen pairs of opposite 
parallel edges, twelve pairs of opposite parallel plane faces (i.e., twenty-four 
squares), and four pairs of opposite parallel solid faces (i.e., eight cubes). 
It may be asked here what happens to the other eight lines, sixteen planes, and 
twelve solids of the (R.S-C) that becomes a hypercube. The answer to this 
query lies in the enumeration of its eight diagonals joining the eight pairs of 
its opposite vertices, sixteen central rectangles determined by the sixteen 
pairs of its opposite parallel edges, and twelve central rectangular parallel- 
epipeds determined by the twelve pairs of its opposite parallel square faces. 
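The bookkeeping in this paragraph can be tallied in a few lines. The Python sketch below uses the standard count $\binom{n}{k} 2^{n-k}$ for the k-dimensional faces of an n-cube and adds the central elements listed above; the variable names are illustrative.

```python
from math import comb

def cube_faces(n, k):
    """Number of k-dimensional faces of the n-dimensional cube."""
    return comb(n, k) * 2 ** (n - k)

vertices, edges, squares, cubes = (cube_faces(4, k) for k in range(4))

# One central element for each pair of opposite elements, as in the text.
diagonals = vertices // 2                 # joins of opposite vertices
central_rectangles = edges // 2           # through pairs of opposite edges
central_parallelepipeds = squares // 2    # through pairs of opposite squares

totals = (vertices,
          edges + diagonals,
          squares + central_rectangles,
          cubes + central_parallelepipeds)
print(totals)  # (16, 40, 40, 20)
```

The totals agree with the sixteen points, forty lines, forty planes and twenty solids of the (R.S-C).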

(d) When the (R.S-C) becomes a hypercube (§3(b)), the four pairs of
tetrahedra $T'_\theta$, $T''_\theta$ formed by the tetrads of planes $\xi_\theta$, $\xi'_\theta$ (§§3(b), 1(c)) form
four stellae octangulae (§3(a)) inscribed in four cubes with their common
centre at E. These cubes are reciprocal to the four octahedra formed by the
four sets of three diagonals of the cross polytope (§3(b)) reciprocal to the
hypercube. The twenty-four square faces of these four cubes are readily
recognized to be the eight triads of the central square sections of the eight
cube faces of the hypercube.


4. The Newtonian solid. 


(a) The ten midpoints of the ten segments determined by the pairs of opposite 
vertices of an S-configuration lie in a solid, referred to as its Newtonian solid, 
and form a Desargues’ (103) configuration there. 

We shall refer to this as a Newton's Theorem. 

(b) Conversely: If on the edges of a simplex pairs of points are marked har- 
monic to the respective pairs of its vertices and so that the midpoints of the ten 
segments so marked lie in a solid, the ten pairs of points marked form an S- 
configuration. 

This follows from the converse of the Newton's Theorem (2) in space. For 
the midpoints of the six such segments marked on the six edges of the tetra- 
hedron of a solid face of (S) lie in a plane, common to this solid and the solid 
of the ten points under consideration, leading to the desmic system of the 
six pairs of points marked in the solid face of (S) considered. Five such systems
in the five solid faces of (S) constitute an S-configuration and hence the 
proposition. 

(c) An S-configuration is determined by its diagonal simplex and its New- 
tonian solid. 

For a desmic system of points is determined (2) by its diagonal tetrahedron 
in a solid face of (S) and its Newtonian plane common to this solid and the 
given Newtonian solid. 
















(d) The ten harmonic conjugates, w.r.t. the pairs of opposite vertices of an S- 
configuration, of the points of intersection of the edges of its diagonal simplex 
with a given transversal solid, lie in a solid. 

This is a projective form of the Newton’s Theorem (§4(a)). 

(e) The pairs of points of contact of the pairs of hyperspheres coaxal with a 
given hypersphere and the circumhypersphere of a given simplex touching its 
edges form the pairs of opposite vertices of an S-configuration. 

The pairs of points of contact under consideration on each edge of (S) form 
respectively the united elements or foci of the involution determined by the 
pairs of its intersections with the family of coaxal hyperspheres considered. 
They therefore are separated harmonically by the respective pairs of vertices 
of (S), being the intersections of its edges with its circumhypersphere that 
belongs to the family. Again the midpoints of their segments evidently lie 
on the radical solid of the family and hence the proposition (§4(b)). 


5. Centres of similitude of five hyperspheres. 


(a) The centres of similitude of five hyperspheres taken two at a time form an 
S-configuration the diagonal simplex of which has for its vertices the centres of 
the given hyperspheres. 

The centres of similitude of a pair of hyperspheres are defined in (8), by 
analogy with those of a pair of spheres, as a pair of points dividing the seg- 
ment between their centres in the ratio of their radii or as the double points 
(10) of the involution determined by their centres and their limiting points 
that represent the two zero-hyperspheres belonging to their family of coaxals. 
Thus: 

(i) The centres of similitude of a pair of hyperspheres are the same as those 
of their great spheres or great circles lying in a solid or a plane through the 
line of their centres as its sections with them. 

(ii) The centres of similitude of three hyperspheres taken two at a time 
are the same as those of their great spheres lying in a solid through the plane 
of their centres, or of their great circles in this plane itself, as its sections with 
them, and therefore form the three pairs of vertices (10) of a quadrilateral in 
this plane such that their centres form its diagonal triangle. 

(iii) The centres of similitude of four hyperspheres taken two at a time 
are the same as those of the great spheres in the solid of their centres as its 
sections with them, and therefore form the six pairs of vertices (2) of a desmic 
system such that their centres form its diagonal tetrahedron. 

(iv) The centres of similitude of five hyperspheres taken two at a time 
form a configuration such that those of four of them form a desmic system 
as its section with the solid of the centres of the four hyperspheres considered 
and hence the proposition (§3(a)). 

(b) Conversely: With the vertices of the diagonal simplex of an S-configuration 
as centres five hyperspheres may be drawn so that the pairs of its opposite vertices 
will be the centres of similitude of the five hyperspheres taken in pairs. 







We can draw four spheres with centres at the vertices of the diagonal
tetrahedron of a desmic system so that the pairs of opposite vertices of the system 
are the centres of similitude of the four spheres (2) taken in pairs. Therefore 
we can draw four such hyperspheres and consequently five hyperspheres 
satisfying the necessary conditions of the proposition. 

(c) The ten hyperspheres of similitude of five hyperspheres taken in pairs belong 
to a coaxal net, that is, they are orthogonal to a coaxal family of hyperspheres. 

The hypersphere of similitude (8) of a pair of hyperspheres is the one drawn 
on the join of their centres of similitude as diameter and therefore their 
centres are a pair of inverse points (10) w.r.t. this hypersphere. Hence any 
hypersphere through the centres of a pair of given hyperspheres is orthogonal 
to their hypersphere of similitude which is coaxal with them and therefore 
orthogonal to any hypersphere orthogonal to them. Thus the ten hyperspheres 
of similitude of five given hyperspheres taken in pairs are orthogonal to the 
one orthogonal to the given five and to the other through their five centres. 


6. Elliptic space.¹ 


(a) When deriving the elliptic 3-space from a 3-sphere in Euclidean 4- 
space by identifying antipodal points, it is observed by Coxeter (3, p. 478) 
that: ‘When antipodal points are identified, the four hexagonal central sections 
of a cuboctahedron yield the sides of a complete quadrilateral, and the twelve 
cuboctahedral central sections of {3, 4, 3} yield the twelve planes of Reye's 
configuration.’ The Reye’s configuration (§1(c)) is identical with a desmic 
system (§3(a)) that constitutes the regular honeycomb 4[38] of Coxeter 
(3, p. 478) in the elliptic space. 

It may be observed here that, when antipodes are identified, the three 
squares $t_1\beta_2$, which are truncations of the three central squares $\beta_2$ of an
octahedron $\beta_3$, give the twelve diagonals of the six square faces of the
cuboctahedron $t_1\beta_3$ which is a truncation of $\beta_3$, and yield the three
diagonals of the quadrilateral yielded by the four hexagonal central sections
of $t_1\beta_3$. In short, the cuboctahedron $t_1\beta_3$ of a Euclidean space yields a
complete quadrilateral and its diagonal triangle in an elliptic plane. 

Thus the four cuboctahedra which are truncations of the four central 
octahedra $\beta_3$ of a cross polytope $\beta_4$ (§3(d)) yield four quadrilaterals whose 
four diagonal triangles constitute a diagonal tetrahedron of the desmic system 
yielded by the 24-cell {3, 4, 3} or $t_1\beta_4$ (5, p. 148), while the other eight
central cuboctahedra of $t_1\beta_4$ yield the other eight quadrilaterals of Reye's 
configuration, whose diagonal triangles form the eight faces of the other two 
diagonal tetrahedra (§3(a)) of the desmic system. 

In fact, the 24-cell {3, 4, 3} = $t_1\beta_4$ of a Euclidean 4-space yields, in elliptic 
3-space, a desmic system of three quadruply perspective tetrahedra and its 
conjugate formed by its three such diagonal tetrahedra. 


¹The idea of elliptic space was suggested by the referee. 













(b) Following the above chain of argument we may now expect and observe 
that: The truncation $t_1\beta_5$ (4) of a Euclidean 5-dimensional cross polytope $\beta_5$ 
yields an S-configuration (S-C) and its diagonal simplex (S) in an elliptic 
4-space, when antipodal points are identified. For the five 24-cells {3, 4, 3} 
which are truncations of the five central cross polytopes $\beta_4$ of $\beta_5$ (5, p. 136) 
yield five desmic systems (§3(a)) which constitute the (S-C) derived from 
$t_1\beta_5$, and the ten cuboctahedra which are truncations of the ten central
octahedra $\beta_3$ of $\beta_5$ yield ten quadrilaterals whose ten diagonal triangles (§1(c)) 
constitute the simplex (S) (§1(a)), and whose forty sides give the forty 
triads (§1(b)) of collinear points of the (S-C). The said ten central
cuboctahedra of $t_1\beta_5$ lie by fours in each central $t_1\beta_4$ of $t_1\beta_5$; the other five
octads of central cuboctahedra (one octad in each central $t_1\beta_4$) of $\beta_5$ yield
the forty quadrilaterals (§1(c)) of the (S-C). Finally the twenty diagonals of
$t_1\beta_5$ yield the twenty vertices (§1(a)) of the (S-C) and the sixteen pairs of
parallel truncations $t_1\alpha_4$ of the sixteen pairs of parallel opposite cells $\alpha_4$ of
$\beta_5$ yield the sixteen (§1(d)) Desargues' $10_3$ configurations of the (S-C). For
one $t_1\alpha_4$ consists of five octahedra which are truncations of the five
tetrahedral faces of an $\alpha_4$, that contain twenty triangles and fifteen squares;
thus the elements of a pair of parallel $t_1\alpha_4$ will constitute a pentad of
cuboctahedra which yield a pentad of quadrilaterals of the Desargues' configuration. 

For further elucidation we work out some details of the 5-dimensional 
polytope $t_1\beta_5$ following Coxeter (5, pp. 145-48, 158, 197-202) as follows. It is
denoted by a Coxeter graph (not recoverable from the scan) or by the symbol
$\begin{Bmatrix} 3 \\ 3, 3, 4 \end{Bmatrix}$.

Its elements consist of: 

3840/(2.48) = 40 vertices, 
3840/(2.8) = 240 edges, 
3840/(6.8) = 80 triangles, 
3840/(6.2) = 320 triangles, 
3840/24 = 160 tetrahedra {3, 3}, 
3840/(24.2) = 80 octahedra $\begin{Bmatrix} 3 \\ 3 \end{Bmatrix}$, 
3840/120 = 32 $t_1\alpha_4$ or $\begin{Bmatrix} 3 \\ 3, 3 \end{Bmatrix}$, 
3840/384 = 10 $t_1\beta_4$ or $\begin{Bmatrix} 3 \\ 3, 4 \end{Bmatrix}$. 


The 40 vertices form its 20 diagonals. 


The 240 edges form its 40 central hexagons yielding the forty triads of 
collinear points of the (S-C). 













The 80 triangles and the 60 central squares of the 10 8, form its first central 
10 cuboctahedra yielding the diagonal simplex of the (S-C). 

The 320 triangles and the 240 central squares of the 80 octahedra form its 
other 40 central cuboctahedra. 

The 160 tetrahedra form its 10 ,. 
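
The element counts listed above can be checked numerically. The following sketch is an illustration only and is not part of the original argument. It assumes the usual coordinate model in which the 40 vertices of $t_1\beta_5$ are the points of Euclidean 5-space with exactly two coordinates equal to ±1 and the rest zero; all function and variable names are hypothetical.

```python
from itertools import combinations, product

def rectified_cross_polytope_vertices(dim=5):
    """Assumed model: points with exactly two coordinates +-1, the rest 0."""
    verts = []
    for i, j in combinations(range(dim), 2):
        for si, sj in product((1, -1), repeat=2):
            v = [0] * dim
            v[i], v[j] = si, sj
            verts.append(tuple(v))
    return verts

def sqdist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

verts = rectified_cross_polytope_vertices(5)
# Two vertices are joined by an edge when they are at the minimum distance sqrt(2).
adjacent = {frozenset(p) for p in combinations(verts, 2) if sqdist(*p) == 2}
# In this model a triangle is a triple of mutually adjacent vertices.
triangles = [t for t in combinations(verts, 3)
             if all(frozenset(p) in adjacent for p in combinations(t, 2))]
# A diagonal joins a vertex to its antipode.
diagonals = {frozenset((v, tuple(-x for x in v))) for v in verts}

GROUP_ORDER = 3840  # 2**5 * 5!, the order used in the quotients of the list
quotients = {
    "vertices": GROUP_ORDER // (2 * 48),
    "edges": GROUP_ORDER // (2 * 8),
    "triangles": GROUP_ORDER // (6 * 8) + GROUP_ORDER // (6 * 2),
    "cells": GROUP_ORDER // 24 + GROUP_ORDER // (24 * 2),
    "facets": GROUP_ORDER // 120 + GROUP_ORDER // 384,
}
# Alternating sum of the element numbers of a convex 5-dimensional polytope.
euler = (quotients["vertices"] - quotients["edges"] + quotients["triangles"]
         - quotients["cells"] + quotients["facets"])
print(len(verts), len(adjacent), len(triangles), len(diagonals), euler)
```

The direct enumeration agrees with the quotients 40, 240 and 80 + 320, and the alternating sum of all the element numbers equals 2, as it must for a convex polytope in five dimensions.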


II. SPACE OF n DIMENSIONS 
7. S-configuration. 


(a) Analogously (§1(a)) we can define an S-configuration (S-C)n in an
n-dimensional space as one constituted by the $\binom{n+1}{2}$ traces of a hyperplane s
on the edges of a given n-dimensional simplex (S)n and their harmonic
conjugates w.r.t. the respective pairs of vertices on each edge of (S)n, which
therefore is called analogously the diagonal simplex of the (S-C)n. 

Evidently: The section of an n-dimensional S-configuration by an r-dimen- 
sional face of its diagonal simplex is an r-dimensional S-configuration there with 
the r-dimensional simplex of the face considered as its diagonal simplex. 

Thus: The S-configuration in a solid and a plane face of the diagonal simplex 
of an n-dimensional S-configuration are the desmic system and the complete 
quadrilateral there respectively, as their sections with it, of which the diagonal 
tetrahedron and the diagonal triangle are those in the faces considered. 

(b) On each edge of (S)n there are two points of the (S-C)n, called a pair 
of its opposite vertices, separated harmonically by the pair of vertices of 
(S)n on the edge considered. In each plane face of (S)n there are then three 
pairs of opposite vertices of the (S-C)n on the three edges of this face, as the 
three pairs of opposite vertices of its quadrilateral section of the (S-C)n. Thus: 

The $2\binom{n+1}{2}$ vertices of an n-dimensional S-configuration lie by twos on the
edges of its diagonal simplex separated harmonically by the corresponding pairs
of vertices of the simplex, by threes on four lines in each plane face of the
simplex as the six vertices of the quadrilateral formed by them, therefore by
threes on $4\binom{n+1}{3}$ lines in all which then constitute the S-configuration.
Through each vertex of it there pass two of its lines in each plane face of the
simplex and 2(n − 1) in all. 

For through each edge of (S)n there pass (n − 1) of its plane faces and each 
vertex of the (S-C)n lies on an edge of (S)n. 













(c) n independent lines through a point determine an n-dimensional space 
and (n − 1) of them a hyperplane. If through a vertex of the (S-C)n we take 
one of two lines (§7(b)) in each plane face of (S)n through that edge of (S)n 
on which lies the vertex of the (S-C)n considered, we obtain $2^{n-1}$ sets of
(n − 1) independent lines that determine $2^{n-1}$ hyperplanes. 

Now if we take any two lines of the said (n − 1) lines determining a
hyperplane, we observe two more such lines completing a quadrilateral in their 
plane, intersecting in a vertex of the (S-C)n on that edge of (S)n which is the 
sixth edge of the tetrahedral face of (S)n determined by the two of its plane 
faces that contain the two lines considered, one in each, and five of its edges 
or those of its tetrahedral face under consideration (§7(a)). Thus each of the 
$2^{n-1}$ hyperplanes through a vertex of the (S-C)n meets every edge of (S)n in 
a vertex of the (S-C)n, or in other words, each such hyperplane contains
$\binom{n+1}{2}$ vertices of the (S-C)n. Therefore there are 

$$2\binom{n+1}{2} \cdot 2^{n-1} \Big/ \binom{n+1}{2} = 2^{n}$$

hyperplanes in all, of the type of the given one, viz. s (§7(a)), we started with, 
that constitute the (S-C)n. Thus: The n-dimensional S-configuration consists 
of $2\binom{n+1}{2}$ vertices and $2^n$ hyperplanes such that through each vertex there
pass $2^{n-1}$ (half the number) of its hyperplanes and each hyperplane contains
$\binom{n+1}{2}$ (half the number) of its vertices. 


8. Dual of an S-configuration. 


(a) The dual (R.S-C)n of an S-configuration is then one constituted by the
$\binom{n+1}{2}$ hyperplanes through the (n − 2)-dimensional faces of a given simplex
(S)n joined to a given point and their harmonic conjugates w.r.t. the respective 
pairs of cells through each (n − 2)-dimensional face of (S)n, which therefore 
is called analogously the diagonal simplex of the (R.S-C)n. 

(b) We can arrive at the $2^n$ vertices (±1, ..., ±1, 1) of the (R.S-C)n, 
referred to (S)n, in the manner we did for a four dimensional (R.S-C) (§2(b)). 













Thus: The vertices of an n-dimensional reciprocal of an S-configuration form a 
closed set of $2^n$ points, w.r.t. their diagonal simplex, such that all quadrics, for 
which this simplex is selfpolar, passing through one of them pass through all 
of them, and each vertex is an harmonic inverse of every other w.r.t. a pair of 
opposite elements of the simplex (7). 

(c) Extending the idea of polarity (7) w.r.t. an n-dimensional simplex we 
can state (cf. §2(c)) that: The $2^n$ hyperplanes of an n-dimensional S-configuration 
are the polar hyperplanes of the vertices of the reciprocal configuration w.r.t. 
their diagonal simplex such that through each vertex of the S-configuration there 
pass the joins of $2^{n-2}$ pairs of vertices, of the reciprocal, lying in one of its 
$2\binom{n+1}{2}$ hyperplanes corresponding to the vertex of the S-configuration
considered. 
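
The incidence numbers of §§7 and 8 can be verified by brute force for small n. The sketch below is illustrative and rests on an assumed choice of coordinates referred to the diagonal simplex: the opposite vertices on the edge joining the i-th and j-th vertices of (S)n are taken as e_i + e_j and e_i − e_j, and each vertex (±1, ..., ±1, 1) of the reciprocal is read also as the coefficient vector of its polar hyperplane. The function names are hypothetical.

```python
from itertools import combinations, product
from math import comb

def s_configuration(n):
    """Assumed coordinates referred to the diagonal simplex (S)n."""
    vertices = []
    for i, j in combinations(range(n + 1), 2):
        for s in (1, -1):                 # opposite vertices e_i + e_j, e_i - e_j
            v = [0] * (n + 1)
            v[i], v[j] = 1, s
            vertices.append(tuple(v))
    # Vertices (+-1, ..., +-1, 1) of the reciprocal; each also gives the
    # polar hyperplane sum(eps_k * x_k) = 0 of the S-configuration.
    reciprocal = [eps + (1,) for eps in product((1, -1), repeat=n)]
    return vertices, reciprocal

def on_hyperplane(v, eps):
    return sum(a * b for a, b in zip(v, eps)) == 0

def collinear(u, v, w):
    # For these particular points a dependency u + b*v + c*w = 0 with
    # b, c = +-1 exists exactly when the three points are collinear.
    return any(all(a + b * x + c * y == 0 for a, x, y in zip(u, v, w))
               for b, c in product((1, -1), repeat=2))

def joins_through(v, reciprocal):
    """Number of pairs of reciprocal vertices whose join passes through v."""
    targets = {tuple(2 * a for a in v), tuple(-2 * a for a in v)}
    count = 0
    for x, y in combinations(reciprocal, 2):
        diff = tuple(a - b for a, b in zip(x, y))
        summ = tuple(a + b for a, b in zip(x, y))
        if diff in targets or summ in targets:
            count += 1
    return count

def counts(n):
    vertices, reciprocal = s_configuration(n)
    lines = [t for t in combinations(vertices, 3) if collinear(*t)]
    return {
        "vertices": len(vertices),
        "hyperplanes": len(reciprocal),
        "lines": len(lines),
        "vertices_per_hyperplane": {sum(on_hyperplane(v, h) for v in vertices)
                                    for h in reciprocal},
        "hyperplanes_per_vertex": {sum(on_hyperplane(v, h) for h in reciprocal)
                                   for v in vertices},
        "joins_per_vertex": {joins_through(v, reciprocal) for v in vertices},
    }

print(counts(4))
```

For n = 4 this reproduces the 20 vertices, 16 hyperplanes and 40 lines of the four-dimensional configuration, with 10 vertices in each hyperplane, 8 hyperplanes through each vertex, and 4 joins of reciprocal vertices through each vertex.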


9. Cross polytopes and hypercubes. Following the line of argument above 
(§3(b)) we are now in a position to state that: The n-dimensional S-configura- 
tion reduces to an n-dimensional cross polytope and its reciprocal to an n-dimen- 
sional hypercube reciprocal to this polytope, when a cell of their diagonal simplex 
recedes to infinity as a selfpolar (n — 1)-dimensional simplex for the absolute 
polarity, with centre at the vertex of the simplex opposite this cell. 

The apparent deficiencies of elements of the hypercube as a reduction of 
an (R.S-C)n are supplemented by its diagonal elements (§3(c)); those of the 
cross polytope as a reduction of an (S-C)n lie at infinity (§3(b)). 


10. Newtonian hyperplanes. 

(a) The midpoints of the $\binom{n+1}{2}$ segments determined by the pairs of opposite
vertices of an n-dimensional S-configuration lie in a hyperplane, referred to as
its Newtonian hyperplane. 

We shall refer to this as a Newton's Theorem. This and the following 
results of this article and those of the next two articles follow by the method 
of induction (§4). 

(b) Conversely: If on the edges of an n-dimensional simplex pairs of points 
are marked harmonic to the respective pairs of its vertices so that the midpoints 
of the $\binom{n+1}{2}$ segments so marked lie in a hyperplane, the $\binom{n+1}{2}$ pairs of
points marked form the pairs of opposite vertices of an n-dimensional 
S-configuration. 

(c) An n-dimensional S-configuration is determined by its diagonal simplex 
and its Newtonian hyperplane. 

(d) The $\binom{n+1}{2}$ harmonic conjugates, w.r.t. the pairs of vertices of an
S-configuration, of the points of intersection of the edges of its diagonal
simplex with a given transversal hyperplane lie in a hyperplane (projective form
of the Newton's Theorem above). 

(e) As an immediate application of the above Theorem (§10(b)) we have: 
The pairs of points of contact of the pairs of the hyperspheres, coaxal with a 
given hypersphere and the circumhypersphere of a given simplex (n-dimensional), 
touching its edges form the pairs of opposite vertices of an n-dimensional S-con- 
figuration, the Newtonian hyperplane of which is the radical hyperplane of the 
family of the coaxal hyperspheres considered. 


11. Centres of similitude of n + 1 hyperspheres. 


(a) The centres of similitude of n + 1 hyperspheres taken in pairs form an 
S-configuration, the diagonal simplex of which is the central simplex of the given 
hyperspheres. 

(b) Conversely: With the vertices of the diagonal simplex of an n-dimensional 
S-configuration as centres (n + 1) hyperspheres may be drawn so that the pairs 
of its opposite vertices will be the centres of similitude of the n + 1 hyperspheres 
taken in pairs. 

(c) The $\binom{n+1}{2}$ hyperspheres of similitude of n + 1 hyperspheres taken in pairs
belong to a coaxal net, that is, they are orthogonal to a coaxal family of
hyperspheres. 

Extending the argument above (§5(c)) to an n-dimensional space, we may 
note that the $\binom{n+1}{2}$ hyperspheres of similitude of (n + 1) given hyperspheres
taken in pairs are orthogonal to the hypersphere, orthogonal to the given
hyperspheres, and to the circumhypersphere of their central simplex, and
therefore to the family of coaxals determined by these two hyperspheres. 


12. Elliptic space. When elliptic n-space is derived from an n-sphere in 
Euclidean (n + 1)-space by identifying antipodal points, the (n + 1)-dimensional 
polytope $t_1\beta_{n+1}$ (which is a truncation of the (n + 1)-dimensional cross polytope 
$\beta_{n+1}$) will yield an n-dimensional S-configuration (§6). 

Here $t_1\beta_{n+1}$ is denoted by the symbol $\begin{Bmatrix} 3 \\ 3^{n-2}, 4 \end{Bmatrix}$ or by a
Coxeter graph (not recoverable from the scan). 


Thanks are due to the referee for the present form of the paper, and to 
Professor B. R. Seth for due encouragement and for providing necessary 
facilities to continue my pursuits. 


REFERENCES 


1. H. F. Baker, Principles of geometry, vol. 4 (Cambridge, 1940), p. 104. 

2. N. A. Court, Theorems, their converses and their extensions, Nat. Math. Mag., 17, 6 (1943), 
1-7. 

3. H. S. M. Coxeter, Regular honeycombs in elliptic space, Proc. Lond. Math. Soc. (3), 4 
(1954), 471-501. 

4. ———, The polytopes with regular-prismatic vertex figures, Phil. Trans. Royal Soc. A 229 
(1931), 359-60. 

5. ———, Regular polytopes (London, 1948). 

6. Sahib Ram Mandan, Umbilical projection, Proc. Ind. Acad. Scs., Sec. A, 15, 1 (1942), 
16-17. 

7. ———, Harmonic inversion (in press). 

8. ———, Altitudes of a simplex in four dimensional space (in press). 

9. ———, Mutually selfpolar pentads in $S_4$, Panjab Uni. Res. Bul., 14 (1951), 31-32. 

10. ———, On four intersecting spheres (in press). 


Indian Institute of Technology 
Kharagpur 








ON THE MEDIANS OF A TRIANGLE IN HYPERBOLIC 
GEOMETRY 


O. BOTTEMA 


1. In non-Euclidean geometry the three medians of a triangle $A_1A_2A_3$ 
(each joining a vertex $A_i$ with the internal midpoint $G_i$ of the opposite side) 
are concurrent; their common point is the centroid G. But the Euclidean 
theorem 

$$\frac{GG_i}{A_iG_i} = \frac{1}{3},$$

which depends on similarity, does not hold. In what follows we make some 
remarks on this ratio, restricting ourselves to hyperbolic geometry. 

In accordance with a procedure recommended by Coxeter (1, p. 229), we 
take $A_1A_2A_3$ as the triangle of reference for projective co-ordinates $x_1, x_2, x_3$; 
the equation of the absolute conic $\Omega$ then appears in the general form. For 
our purpose we take, moreover, G as the unit-point. The equation of $\Omega$ is 
now 

$$x_1^2 + x_2^2 + x_3^2 + 2\cosh a_1\, x_2x_3 + 2\cosh a_2\, x_3x_1 + 2\cosh a_3\, x_1x_2 = 0, \tag{1}$$

where $a_i$ is the length of the side opposite $A_i$. The tangential equation of $\Omega$ 
reads 

$$\begin{aligned}
&\sinh^2 a_1\, u_1^2 + \sinh^2 a_2\, u_2^2 + \sinh^2 a_3\, u_3^2 + 2(\cosh a_1 - \cosh a_2 \cosh a_3)\,u_2u_3 \\
&\quad + 2(\cosh a_2 - \cosh a_3 \cosh a_1)\,u_3u_1 + 2(\cosh a_3 - \cosh a_1 \cosh a_2)\,u_1u_2 = 0.
\end{aligned} \tag{2}$$

From $A_i$ being inside $\Omega$ follows the inequality (1, p. 239) 

$$\gamma = 2\cosh a_1 \cosh a_2 \cosh a_3 - \cosh^2 a_1 - \cosh^2 a_2 - \cosh^2 a_3 + 1 > 0, \tag{3}$$

which is equivalent with the fact that a side of the triangle is less than the 
sum of the other two. 


2. The median $A_3G_3$ has the equations $x_1 = x_2 = \lambda$, $x_3 = 1$, where $\lambda$ is a 
parameter; for $\lambda = \infty, 0, 1$ we have the points $G_3$, $A_3$, $G$. The points of
intersection $S_1$ and $S_2$ of the median and the absolute are given by the roots
$\lambda_1$, $\lambda_2$ of the equation 

$$2\lambda^2(1 + b_3) + 2(b_1 + b_2)\lambda + 1 = 0, \tag{4}$$

where $b_i$ is written for $\cosh a_i$. Both roots are negative. We put $\mu_i = -\lambda_i$, 
$\mu_2 > \mu_1$, $A_3G_3 = z_3$, $GG_3 = y_3$. Then 





Received March 3, 1958. 


$$z_3 = \tfrac12 \log (S_1S_2A_3G_3), \qquad y_3 = \tfrac12 \log (S_1S_2GG_3)$$

or 

$$e^{2z_3} = \frac{\mu_2}{\mu_1}, \qquad e^{2y_3} = \frac{\mu_2 + 1}{\mu_1 + 1}.$$

Hence 

$$\begin{aligned}
\sinh z_3 &= \frac{\mu_2 - \mu_1}{2\sqrt{\mu_1\mu_2}}, &\qquad \sinh y_3 &= \frac{\mu_2 - \mu_1}{2\sqrt{(\mu_2 + 1)(\mu_1 + 1)}}, \\
\cosh z_3 &= \frac{\mu_2 + \mu_1}{2\sqrt{\mu_1\mu_2}}, &\qquad \cosh y_3 &= \frac{\mu_2 + \mu_1 + 2}{2\sqrt{(\mu_2 + 1)(\mu_1 + 1)}},
\end{aligned} \tag{5}$$

and 

$$\tanh y_3 = \frac{\mu_2 - \mu_1}{\mu_2 + \mu_1 + 2} = \frac{\sqrt{\mu_1\mu_2}\,\sinh z_3}{\sqrt{\mu_1\mu_2}\,\cosh z_3 + 1}. \tag{6}$$

Furthermore, 

$$\mu_1\mu_2 = \lambda_1\lambda_2 = \frac{1}{2(1 + b_3)} = \frac{1}{4\cosh^2 \frac12 a_3}, \tag{7}$$

and so we get the following formulae: 

$$(\mu_1 + 1)(\mu_2 + 1) = \frac{2(b_1 + b_2 + b_3) + 3}{2(1 + b_3)}, \tag{8}$$

$$\cosh z_3 = \frac{\cosh a_1 + \cosh a_2}{2\cosh \frac12 a_3}, \tag{9}$$

$$\frac{\sinh y_3}{\sinh z_3} = \{2(b_1 + b_2 + b_3) + 3\}^{-1/2}, \tag{10}$$

$$\tanh y_3 = \frac{\sinh z_3}{\cosh z_3 + 2\cosh \frac12 a_3}. \tag{11}$$

3. In (9) we have the well-known formula giving the length of a median as 
a function of the sides. From (10) it follows that 

If $A_iG_i$ are the medians of the triangle $A_1A_2A_3$, and G is the centroid, then 

$$\frac{\sinh GG_1}{\sinh A_1G_1} = \frac{\sinh GG_2}{\sinh A_2G_2} = \frac{\sinh GG_3}{\sinh A_3G_3};$$

the common value of the three ratios is $\{2(\cosh a_1 + \cosh a_2 + \cosh a_3) + 3\}^{-1/2}$. 
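
The theorem of §3 lends itself to a numerical check. The sketch below is illustrative only: it does not use the projective co-ordinates of the text but assumes the hyperboloid model of the hyperbolic plane, in which the midpoint of two points and the common point of the medians are obtained by normalising vector sums. The helper names and the sample triangle are hypothetical.

```python
import math

def mink(u, v):
    """Minkowski bilinear form of signature (-, +, +)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def point(r, theta):
    """Point at hyperbolic distance r from the origin in direction theta."""
    return (math.cosh(r), math.sinh(r) * math.cos(theta),
            math.sinh(r) * math.sin(theta))

def dist(u, v):
    # cosh(distance) = -<u, v>; the max() guards against rounding below 1.
    return math.acosh(max(1.0, -mink(u, v)))

def normalize(u):
    s = math.sqrt(-mink(u, u))
    return tuple(x / s for x in u)

def add(*vs):
    return tuple(sum(c) for c in zip(*vs))

def median_ratios(A):
    """sinh(G G_i) / sinh(A_i G_i) for i = 1, 2, 3."""
    G = normalize(add(*A))                    # centroid: lies on every median
    ratios = []
    for i in range(3):
        others = [A[j] for j in range(3) if j != i]
        Gi = normalize(add(*others))          # midpoint of the opposite side
        ratios.append(math.sinh(dist(G, Gi)) / math.sinh(dist(A[i], Gi)))
    return ratios

def predicted(A):
    """The common value {2(cosh a_1 + cosh a_2 + cosh a_3) + 3}^(-1/2)."""
    sides = [dist(A[1], A[2]), dist(A[2], A[0]), dist(A[0], A[1])]
    return (2 * sum(math.cosh(a) for a in sides) + 3) ** -0.5

A = [point(0.7, 0.3), point(1.1, 2.0), point(0.5, 4.1)]
print(median_ratios(A), predicted(A))
```

The three computed ratios coincide with the predicted value up to rounding error.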





4. From (11) it is seen that $y_3$ is a function of $z_3$ and $a_3$ only. Therefore: 

If for the triangle $A_1A_2A_3$ the base $A_1A_2$ and the length of the median $A_3G_3$ are 
given then $GG_3$ has a fixed value. 

If for abbreviation we denote $p_i = 2\cosh \tfrac12 a_i$, we have (suppressing the 
index i): 

$$\tanh y = \frac{\sinh z}{\cosh z + p}. \tag{12}$$













Obviously y = 0 for z = 0. Furthermore, differentiating the formula we get 

$$\frac{1}{\cosh^2 y}\,\frac{dy}{dz} = \frac{1 + p\cosh z}{(\cosh z + p)^2},$$

or 

$$\frac{dy}{dz} = \frac{1 + p\cosh z}{1 + p^2 + 2p\cosh z}. \tag{13}$$

Hence dy/dz is an increasing function of z; for z = 0 we have 

$$\frac{dy}{dz} = \frac{1}{1 + p};$$

its limit for $z \to \infty$ is $\tfrac12$. Therefore: 

If the base $A_1A_2 = a_3$ of the triangle is fixed, then $GG_3/A_3G_3$ increases if 
$A_3G_3$ increases and we have the inequality 

$$\frac{1}{1 + 2\cosh \frac12 a_3} < \frac{GG_3}{A_3G_3} < \frac12. \tag{14}$$

As a consequence we have for all triangles the inequality 

$$0 < \frac{GG_3}{A_3G_3} < \frac12. \tag{15}$$

It follows from the proof that the limits in (14) and (15) cannot be sharpened. 


5. The Euclidean value $\tfrac13$ is between the limits given in (15). Therefore 
there are triangles for which 

$$\frac{GG_3}{A_3G_3} = \frac13.$$

If in (12) we put z = 3y, we get 

$$\tanh y = \frac{\sinh 3y}{\cosh 3y + p}.$$

Substituting $\sinh 3y = \sinh y\,(4\cosh^2 y - 1)$, $\cosh 3y = \cosh y\,(4\cosh^2 y - 3)$, 
we get 

$$\cosh y = \tfrac12 p = \cosh \tfrac12 a.$$

Therefore: In the triangle $A_1A_2A_3$ we have 

$$\frac{GG_3}{A_3G_3} = \frac13$$

if and only if $GG_3 = \tfrac12 A_1A_2$; hence in the triangle $A_1GA_2$ the angle $\angle A_2GA_1$ 
is the sum of $\angle GA_1A_2$ and $\angle A_1A_2G$. 
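
The statements of §§4 and 5 can be illustrated numerically from formula (12) alone. The sketch below fixes a base $a_3 = 1$ (an arbitrary sample value), evaluates $y = GG_3$ for several medians $z = A_3G_3$, and tests the monotonicity, the bounds of (14), and the Euclidean case $z = \tfrac32 a_3$. The function names are hypothetical.

```python
import math

def gg3(z, a):
    """y = GG_3 from (12), given the median z = A_3G_3 and the base a = a_3."""
    p = 2 * math.cosh(a / 2)
    return math.atanh(math.sinh(z) / (math.cosh(z) + p))

def ratio(z, a):
    return gg3(z, a) / z

a = 1.0                                   # sample base length (assumption)
p = 2 * math.cosh(a / 2)
zs = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
ratios = [ratio(z, a) for z in zs]

# The ratio GG_3 / A_3G_3 increases with the median ...
increasing = all(r1 < r2 for r1, r2 in zip(ratios, ratios[1:]))
# ... and stays between the limits 1/(1 + p) and 1/2 of (14).
within_bounds = all(1 / (1 + p) < r < 0.5 for r in ratios)
# Section 5: the value 1/3 is attained for z = 3a/2, that is, GG_3 = a/2.
euclidean_case = math.isclose(ratio(1.5 * a, a), 1 / 3, rel_tol=1e-9)

print(ratios, increasing, within_bounds, euclidean_case)
```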


6. In such a triangle we have 

$$z_3 = \tfrac32 a_3, \qquad \cosh z_3 = \cosh \frac{a_3}{2}\left(4\cosh^2 \frac{a_3}{2} - 3\right)$$

and therefore, in view of (9) 

$$2\cosh^2 a_3 + \cosh a_3 - \cosh a_1 - \cosh a_2 - 1 = 0. \tag{16}$$
More generally, if 

$$k_3 = 2b_3^2 + b_3 - b_1 - b_2 - 1, \tag{17}$$

we have 

$$\frac{y_3}{z_3} > \frac13, \qquad \frac{y_3}{z_3} = \frac13, \qquad \frac{y_3}{z_3} < \frac13$$

according as $k_3 < 0$, $k_3 = 0$, $k_3 > 0$, respectively. If $b_1 = b_2 = b_3$ we have 
obviously $k_3 > 0$. Hence in an equilateral triangle the ratios $y_i/z_i$ are less 
than $\tfrac13$. We define $k_1$ and $k_2$ analogously to (17). If we put $c_i = b_i - 1$ we get 

$$k_1 = 2c_1^2 + 5c_1 - c_2 - c_3.$$

Hence $k_1 + k_2 + k_3 = 2(c_1^2 + c_2^2 + c_3^2) + 3(c_1 + c_2 + c_3) > 0$, since $c_i > 0$. 
Therefore $k_1 = k_2 = k_3 = 0$ is impossible: There are no triangles for which 
the three ratios $y_i/z_i$ are $\tfrac13$. 

We have 

$$k_1 - k_2 = 2(c_1 - c_2)(c_1 + c_2 + 3). \tag{18}$$

If $k_1 = k_2 = 0$, then $c_1 = c_2$, $b_1 = b_2 = b$, $b_3 = 2b^2 - 1$; but then $\gamma$ is zero and 
the inequality (3) is not satisfied. Therefore, 

There are no triangles for which two ratios $y_i/z_i$ are $\tfrac13$. 

From (18) it follows that $k_1 > k_2$ implies $c_1 > c_2$ (so that $a_1 > a_2$) and
conversely. 

We have established the existence of triangles for which one of the ratios 
$y_i/z_i$ is $\tfrac13$. Suppose $k_3 = 0$. Then $c_1 + c_2 = 2c_3^2 + 5c_3$. Moreover, we have 

$$\gamma = 2c_1c_2c_3 + 2c_3(c_1 + c_2) - (c_1 + c_2)^2 + 4c_1c_2 - c_3^2 > 0$$

or 

$$c_1c_2(c_3 + 2) > 2c_3^4 + 8c_3^3 + 8c_3^2,$$

that is, 

$$c_1c_2 > 2c_3^2(c_3 + 2).$$

It follows from this that 

$$(c_1 - c_2)^2 < c_3^2(2c_3 + 3)^2.$$

Therefore, assuming $c_1 \geq c_2$, we have $c_1 - c_2 < c_3(2c_3 + 3)$, and from 
$c_1 + c_2 = 2c_3^2 + 5c_3$ it follows that $c_1 \geq c_2 > c_3$. Hence 

If in a triangle 

$$\frac{y_3}{z_3} = \frac13,$$

then $a_3$ is smaller than each of the two other sides. 
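
The algebraic identities used in §6 can be confirmed with exact rational arithmetic. The sketch below is illustrative: it evaluates both sides of each identity on a grid of sample values $b_i > 1$ (the values are arbitrary) with $c_i = b_i - 1$, and also checks the degenerate case $k_1 = k_2 = 0$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def k(b, i):
    """k_i as in (17): 2 b_i^2 + b_i minus the other two b's, minus 1."""
    j, l = [t for t in range(3) if t != i]
    return 2 * b[i] ** 2 + b[i] - b[j] - b[l] - 1

def gamma(b):
    """The left-hand side of inequality (3) with b_i = cosh a_i."""
    return 2 * b[0] * b[1] * b[2] - sum(x * x for x in b) + 1

def identities_hold(b):
    c1, c2, c3 = (x - 1 for x in b)
    return (
        k(b, 0) == 2 * c1 ** 2 + 5 * c1 - c2 - c3
        and sum(k(b, i) for i in range(3))
            == 2 * (c1 ** 2 + c2 ** 2 + c3 ** 2) + 3 * (c1 + c2 + c3)
        and k(b, 0) - k(b, 1) == 2 * (c1 - c2) * (c1 + c2 + 3)      # (18)
        and gamma(b) == (2 * c1 * c2 * c3 + 2 * c3 * (c1 + c2)
                         - (c1 + c2) ** 2 + 4 * c1 * c2 - c3 ** 2)
    )

samples = [Fraction(n, 4) for n in range(5, 14)]     # sample values b > 1
all_ok = all(identities_hold(b) for b in product(samples, repeat=3))

# Degenerate case k_1 = k_2 = 0: b_1 = b_2 = b, b_3 = 2 b^2 - 1 gives gamma = 0.
b0 = Fraction(3, 2)
degenerate = (b0, b0, 2 * b0 * b0 - 1)
print(all_ok, k(degenerate, 0), k(degenerate, 1), gamma(degenerate))
```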










In view of all this we have: In a triangle either all three ratios $y_i/z_i$ are less 
than $\tfrac13$ or two of them are $< \tfrac13$ and the third (belonging to the smallest side) $\geq \tfrac13$. 


REFERENCE 


1. H. S. M. Coxeter, Non-Euclidean Geometry (3rd ed.; Toronto, 1957). 


Technische Hogeschool 
Delft, Holland 














CORRECTION TO 
"TRANSITIVITIES IN PROJECTIVE PLANES" 


T. G. OSTROM 


For basic definitions of terms and symbols, see (3). When we refer to 
theorems by number, it is to be understood that these are theorems of the 
basic paper.¹ Professor Pickert has pointed out an error in the proof of Theorem 
16 (ii). As stated, the theorem is false. Case IV of Theorem 4 shows that the 
nearfield plane of order 9 is a counter-example. The dual nearfield plane of 
order 9 is also a counter-example. 

We shall now state and prove a correct version of this Theorem. 


THEOREM 16 (ii). (Given a projective plane which is $p_1$-$L_1$ transitive and 
$p_2$-$L_2$ transitive, where $p_1 \neq p_2$ and $L_1 \neq L_2$.) If $p_1$ is on neither $L_1$ nor $L_2$, $p_2$ is 
on neither $L_1$ nor $L_2$, and $p_1$ and $p_2$ are not collinear with the intersection r of $L_1$ 
with $L_2$, then the plane is Desarguesian unless n = 9. 


Proof: The theorem differs from the original theorem only in excepting the 
case where n = 9. The error in the original proof arose out of the assumption 
that the $p_1$-$L_1$ and $p_2$-$L_2$ perspectivities generate a group which is doubly 
transitive on the points of the line $p_1p_2$. If this collineation group is indeed 
doubly transitive on the points of $p_1p_2$, then the original proof goes through. 
Hence we proceed to investigate the permutation group on $p_1p_2$. 

Let G denote the group of collineations generated by the $p_1$-$L_1$
perspectivities and the $p_2$-$L_2$ perspectivities. Let the line $p_1p_2$ be denoted by
$L_\infty$, and let $L_1 \cap L_\infty = q_1$, $L_2 \cap L_\infty = q_2$. Let $G_1$ be the permutation group
on $L_\infty$ induced by G. 

Now, it follows from the hypotheses that $p_1$, $q_1$, $p_2$, and $q_2$ are four distinct 
points. If n = 3, the plane is Desarguesian. If n is greater than 3, there is 
at least one other point t on $L_\infty$. Under the $p_2$-$L_2$ perspectivities, t can be carried 
into every point on $L_\infty$ except $p_2$ and $q_2$. Under the $p_1$-$L_1$ perspectivities, t 
can be carried into every point on $L_\infty$ except $p_1$ and $q_1$. 

It follows that $G_1$ is at least simply transitive on the points of $L_\infty$. Let 
$G_1(p_i)$ be the subgroup of $G_1$ which fixes $p_i$. $G_1$ will be doubly transitive if and 
only if $G_1(p_1)$ is transitive on all of the points of $L_\infty$ other than $p_1$. Now the 
subgroup of $G_1(p_1)$ induced by the $p_1$-$L_1$ perspectivities is transitive on the 




Received December 9, 1957. 


¹Although the author was not aware of this fact when (3) was written, most of the theorems 
in Part 2 are included in (4). 

The author is indebted to W. R. Cowell for checking some of the computations in this 
paper. 




points of $L_\infty$ other than $p_1$ and $q_1$. Hence, a necessary condition for $G_1$ to fail 
to be doubly transitive is that $q_1$ is fixed by $G_1(p_1)$. Since $G_1$ is at least simply 
transitive, we can generalize this condition so that for each $p_i \in L_\infty$ there 
is a unique point $q_i \in L_\infty$ such that $G_1(p_i)$ fixes $q_i$. Thus, $G_1(p_i)$ is included 
in $G_1(q_i)$ and is transitive on points of $L_\infty$ other than $p_i$ and $q_i$. $G_1(q_i)$ must 
fix some point on $L_\infty$ other than $q_i$. This point can be none other than $p_i$. 
Hence $G_1(p_i)$ and $G_1(q_i)$ include each other, and $G_1(p_i) = G_1(q_i)$. (The proof 
that $G_1(p_i) = G_1(q_i)$ was first made by the author in a form which applied 
only to finite planes; the author is indebted to Professor Pickert for pointing 
out that finiteness is not required.) 

Thus, either $G_1$ is doubly transitive (and the original proof goes through) 
or the set of points on $L_\infty$ can be divided into pairs $(p_i, q_i)$ such that every 
collineation of G which fixes one point of a pair also fixes the other point. 
Following Andre (1), let us call such pairs of points "admissible pairs." We 
shall assume from here on that $G_1$ is not doubly transitive. 

The image of an admissible pair under any collineation of G is an admissible 
pair. Now $(p_1, q_1)$ is an admissible pair and the plane is $p_1$-$L_1$ transitive, 
where $L_1 = rq_1$. It follows that the plane is $p_i$-$L_i$ transitive for each $p_i \in L_\infty$, 
where $L_i$ is the line $rq_i$, since the collineation which carries $p_1$ into $p_i$ transforms 
the $p_1$-$L_1$ group of perspectivities into the $p_i$-$L_i$ group of perspectivities. 
In each case, $G_1(p_i)$ is transitive on the points of $L_\infty$ other than $p_i$ and $q_i$. Thus, 
every point on $L_\infty$ belongs to exactly one admissible pair. This will be
impossible if n is even; we shall henceforth assume that n is odd. 

Now the $p_i$-$L_i$ group of perspectivities is of order n − 1. Let $(p_j, q_j)$ be an 
admissible pair, where $i \neq j$. By the $p_i$-$L_i$ transitive property, there is a 
perspectivity $\rho_i$ with centre $p_i$ and axis $L_i$ which carries $p_j$ into $q_j$. But the 
image of an admissible pair must be an admissible pair; the collineation which 
carries $p_j$ into $q_j$ must carry $q_j$ into $p_j$. Thus $\rho_i$ is of order two. The roles of 
$p_i$ and $q_i$ are interchangeable; thus, there is a perspectivity $\sigma_i$ of order two 
with $q_i$ as centre and $p_ir$ as axis. The product of two perspectivities of order 
two in which the centre of each is on the axis of the other is a perspectivity of 
order two which fixes all of the points on the line of centres (2, Lemma 6). 
Hence, every perspectivity of order two with centre $p_i$ and axis $L_i$ produces 
the same permutation of points on $L_\infty$ as does $\sigma_i$. 

We had previously established that, for every admissible pair $(p_j, q_j)$ there 
was a perspectivity of order two with centre $p_i$ and axis $L_i$ ($i \neq j$) which 
interchanged $p_j$ with $q_j$. The uniqueness property just established then implies 
that $\rho_i$ interchanges the points of every admissible pair except $p_i$ and $q_i$. 
In other words, for each admissible pair $(p_i, q_i)$ the perspectivity of order 
two with centre $p_i$ and axis $L_i$ interchanges the points within each admissible 
pair other than the pair $(p_i, q_i)$. 

Now let us set up a co-ordinate system. Take the point r as the origin 0, 
and choose some admissible pair as the points A and B (the centres of the 
pencils x = constant and y = constant, respectively). It can be readily 
verified that the perspectivity with A as centre and the line y = 0 as axis 
which carries the point (1, 1) into (1, a) also carries (c, d) into (c, da) and 
(m) into (ma), where (c, d) represents any point not on $L_\infty$, and (m) represents 
the common point on $L_\infty$ for all lines of slope m. Likewise, the perspectivity 
with B as centre and the line x = 0 as axis which carries (1, 1) into (a, 1) also 
carries (c, d) into (ca, d) and (m) into (am). 

The co-ordinate system will then have the following properties: 

(i) The co-ordinatisation is linear. 

(ii) Multiplication is associative. 

(iii) (c + b)a = ca + ba. 

Properties (i) and (ii) follow from Theorem 6. Property (iii) follows from 
an argument similar to that used in Theorem 15. 

The uniqueness property of involutions on $L_\infty$ implies that there is exactly 
one element i of multiplicative order two. Consider the following two
perspectivities: 

$\rho$: (c, d) → (c, di), (m) → (mi) 

$\sigma$: (c, d) → (ci, d), (m) → (im). 

The image of (m) under $\rho\sigma$ will be (imi). But, as previously remarked, $\rho\sigma$ is 
a perspectivity of order two fixing every point on $L_\infty$. Thus, m = imi, and i 
commutes with every element in the multiplicative group. 


(iv) There is a unique element i of multiplicative order two, and im = mi for 
every m. 


Now multiplication by i must interchange the points within each admissible 
pair except the pair (A, B). Hence, for each (m), (m) and (mi) are the points 
of an admissible pair. 

Let us consider the perspectivity of order two with axis y = x, centre (i).
We will have:

A ↔ B

(c, c) is fixed

x = c ↔ y = c

(c, d) ↔ (d, c)

(0, b) ↔ (b, 0).

The point (1) is fixed and, since (0, b) ∈ y = x + b, (b, 0) must be on the
image of y = x + b. Hence

y = x + b ↔ y = x + (-b), where b + (-b) = 0.

Moreover, (c, c + b) ↔ (c + b, c) so that (c + b, c) must be on the line
y = x + (-b). This implies

(v) (c + b) + (-b) = c, where b + (-b) = 0.

Also, the fact that (1, m) ↔ (m, 1) implies that lines of slope (m) go into
lines of slope (m^-1). But our collineation must interchange the points of
admissible pairs. Hence mi = m^-1 and

(vi) m² = i for m ≠ 1, i, 0.


Next, we shall establish that i must be -1. We shall then show that 1 + 1
= -1, and, finally, that n = 9. In what follows, we have obtained a number
of very helpful ideas from (1). (The reader should note the use of parentheses
in the equations on one hand, and the indication of points on L_∞ by a single
element within parentheses.)


It follows from the right distributive law that (—1)a = —a, that is, that 
a + (—1)a = 0 for every a in the co-ordinate system. Moreover, it follows 
from (v) that (—a + a) + (—a) = —a and hence, —a +a = 0. 

In particular, -i + i = 0. But 0 = -i + (-1)(-i) = -i + (-1)²i =
-i + i² = -i + 1 (unless -1 = i). This implies that i = 1. Since i was
of multiplicative order two, we have a contradiction unless i = -1.

Thus, we have established that i = —1, and —1 has the following special 
properties: 

(vii) (-1)² = 1, (-1)a = a(-1), b² = -1 if b ≠ 0, ±1.

Furthermore, if a, b, ab ≠ ±1, (ab)² = -1, a^-1 = -a, b^-1 = -b. Hence,

ab = -(-b)(-a) = -ba.

We can now characterize the admissible pairs other than A and B as pairs
(m) and (-m).


Now, (1 + 1)² = (1 + 1) + (1 + 1). But, either 1 + 1 = -1 or (1 + 1)²
= -1. Thus, either 1 + 1 = -1 or (1 + 1) + (1 + 1) = -1.

Let us assume, for the moment, that (1 + 1) + (1 + 1) = -1. The
points (1) and (-1) form an admissible pair. Hence there is a perspectivity
with axis y = x and centre (-1) which carries A into the point (1 + a), B
into (-1 - a), where a may be any element of the co-ordinate system such
that 1 + a ≠ 0, ±1. (The existence of this perspectivity follows from the
fact that the plane was p_i - L_i transitive for each p_i ∈ L_∞ and that the image
of an admissible pair must be an admissible pair.)

The point (1, 1) is fixed under this perspectivity. Hence, the line x = 1 maps
into the line of slope (1 + a) which goes through (1, 1). It is readily verified
that this line has the equation y = x(1 + a) - a. The line y = 0 will map
into the line y = -x(1 + a). Hence (1, 0) must map into the intersection of
y = x(1 + a) - a and y = -x(1 + a).

Moreover, every line of slope -1 is fixed. In particular, the line y = -x + 1
is fixed. The image of (1, 0) must also be on this line.

Now (-1, 1 + 1) satisfies the equations y = -x + 1 and y = -x(1 + 1).
In the particular case where a = 1, we have that (1, 0) must map into
(-1, 1 + 1); it follows that (-1, 1 + 1) must satisfy the equation
y = x(1 + 1) - 1. That is:

1 + 1 = (-1 - 1) - 1

and

c + c = (-c - c) - c for every c.

Using the fact that (1 + a)² = -1, it follows that x = (a + a)(1 + a),
y = a + a, are the simultaneous solutions of the equations y = x(1 + a) - a
and y = -x(1 + a). This pair of values for x and y are the co-ordinates of
the image of (1, 0) under the perspectivity with axis y = x, centre (-1)
which carries A into (1 + a).

But this pair of values for x and y must also satisfy the equation y = -x + 1
and

a + a = -(a + a)(1 + a) + 1, if 1 + a ≠ 0, ±1
      = (1 + a)(a + a) + 1, if a + a ≠ ±1, ±(1 + a) and 1 + a ≠ ±1
      = ((a + a) + a(a + a)) + 1
      = [(a + a) - (a + a)a] + 1, if a ≠ ±1, a + a ≠ ±1, a(a + a) ≠ ±1
      = ((a + a) + (1 + 1)) + 1, a ≠ ±1, 0.


This last equation, and the right inverse law for addition, imply that 1 + 1 =
-1, unless the only values of a that can occur are those included in the
exceptions noted. Re-examining the exceptions, we find that there are at most
six distinct cases: a = ±1, a = ±(1 + 1), a = 0 and the value of a such
that 1 + a = -1. That is, the assumption that 1 + 1 ≠ -1 leads to the
conclusion that 1 + 1 = -1 if our co-ordinate system contains more than
six distinct elements. Since all planes of order 8 or less are Desarguesian, we
can without loss of generality assume that our co-ordinate system contains
at least nine distinct elements.

Thus we can, without loss of generality, assume that 1 + 1 = -1 and,
multiplying on the right, c + c = -c, for every c.

Again consider the perspectivity with axis y = x, centre (-1) which carries
A into (1 + a), B into (-1 - a), where now a is to be fixed but a ≠ 0, ±1.
As before, the point (c, c) is fixed, and the line x = c maps into the line of
slope (1 + a) which goes through (c, c), that is,

x = c → y = x(1 + a) + c*, where c = c(1 + a) + c*.

Also, y = 0 → y = -x(1 + a) and y = -x + c is fixed. The simultaneous
solution of the equations y = x(1 + a) + c*, y = -x(1 + a) is readily
verified to be x = -c*(1 + a), y = -c*, using (1 + a)² = -1, c* + c* = -c*.

This pair of values of x and y must satisfy the equation y = -x + c. Hence
-c* = c*(1 + a) + c.

Now, if c* ≠ 0, ±1, ±(1 + a), this can be written

-c* = -(1 + a)c* + c = (-c* - ac*) + c.

This implies that c = ac* and -ac = c* provided that c* ≠ 0, ±1, ±(1 + a).
(Recall that a ≠ 0, ±1.) If we substitute c* = -ac into c = c(1 + a) + c*,
we get

c = c(1 + a) - ac.

If c ≠ 0, ±1, ±(1 + a), this may be written

c = -(1 + a)c - ac = (-c - ac) - ac.

Adding ac to both sides and using the right inverse law,

c + ac = -(c + ac).

But, since -1 is of multiplicative order two, -1 ≠ 1 and c + ac ≠ -(c + ac)
unless c + ac = 0; that is, (1 + a)c = 0. With a ≠ -1, this implies that
c = 0.

Thus, if c* ≠ 0, ±1, ±(1 + a), the only possible values of c are c = 0,
±1, ±(1 + a) and, for these values of c, c* = -ac. We have only nine
distinct possible values for c*:

0, ±1, ±(1 + a), -a(±1), -a(1 + a), and -a(-1 - a).

But there is a value of c* for each value of c and c*_1 = c*_2 if and only if c_1 = c_2.
Hence, our co-ordinate system contains only nine distinct elements, and


n = 9. Thus, the assumption that G, is not doubly transitive and the plane is 
not Desarguesian lead to the conclusion that n = 9 and the theorem is proved.


REFERENCES 


1. J. André, Projektive Ebenen über Fastkörpern, Math. Z., 62 (1955), 137-160.
2. T. G. Ostrom, Double transitivity in finite projective planes, Can. J. Math., 8 (1956), 563-567.
3. T. G. Ostrom, Transitivities in projective planes, Can. J. Math., 9 (1957), 389-399.
4. G. Pickert, Projektive Ebenen (Berlin, 1955).


Montana State University 




















ON THE NUMBER OF DISSIMILAR GRAPHS BETWEEN 
A GIVEN GRAPH-SUBGRAPH PAIR 


FRANK HARARY 


The purpose of this paper is to integrate the theorems on enumerating 
subgraphs and supergraphs in (2) and (3) respectively by generalizing to a 
result which includes both of these as special cases. In this process we again 
utilize the powerful enumeration method of Pólya (4).

A graph may be defined as a set of p points together with a prescribed
subset of the ½p(p - 1) lines joining pairs of distinct points. Two points of
a graph are adjacent if there is a line joining them. Two graphs are isomorphic
if there is a one-to-one correspondence between their point sets which
preserves adjacency. An automorphism of a graph G is an isomorphism of G
with G. It is well known that the set Γ(G) of all automorphisms of a graph
G forms a group, called the group of the graph. The line group Γ_1(G) of a
graph G has been defined in (2) as the permutation group acting on the lines
of G which is induced by the automorphisms of G. Two points of a graph
are similar if there is an automorphism mapping one onto the other. Similarity
of two lines or of two subgraphs is defined analogously.

The complement G' of a graph G is that graph whose point set coincides
with that of G and in which two points are adjacent whenever they are not
adjacent in G. Let K_p be the complete graph of p points, that is, the graph
of p points in which every two distinct points are adjacent. Then K_p' is the
totally disconnected graph with p points and no lines. Let K_{m,n} be the graph
of m + n points a_1, a_2, ..., a_m, b_1, b_2, ..., b_n and all mn lines of the form a_i b_j.

A spanning subgraph (called a line-subgraph in (2)) of a graph G is a
subgraph of G with the same point set. The main result of (2) is an
enumeration formula for the number of dissimilar spanning subgraphs of a given
graph G. If G has p points, this result may be described as the number of
dissimilar graphs between G and K_p'. The number of dissimilar supergraphs
of G was obtained in (3). This is, of course, the number of dissimilar graphs
between K_p and G. Even the generating function of (1) whose coefficients for
a given value of p are the numbers of non-isomorphic graphs of p points and
q lines may be regarded as enumerating the dissimilar graphs between K_p
and K_p'. More recently a formula (to appear elsewhere) has been obtained
which enumerates bicoloured graphs essentially as the dissimilar graphs
between K_{m,n} and K_p' for m + n = p.

We wish to derive here a generating function for the number of dissimilar
graphs between any given graph G and a given spanning subgraph H. Let
q and q_1 be the number of lines of G and H respectively. We denote this
function

(1) F_{G,H}(x) = a_0 + a_1 x + a_2 x² + ... + a_r x^r

where r = q - q_1 and a_k is the number of dissimilar graphs K between G
and H having k + q_1 lines.

Received November 14, 1957. This work was supported by a grant from the National
Science Foundation.

We now require Pólya's Theorem precisely in the one variable form in
(1, p. 447). Since the definitions appear there we include only a statement
of the theorem here.

PÓLYA'S THEOREM. The configuration counting series F(x) is obtained by
substituting the figure counting series f(x) into the cycle index Z(Γ) of the
configuration group Γ. Symbolically,

(2) F(x) = Z(Γ, f(x)).

This theorem reduces the problem of finding the configuration counting
series to the determination of the figure counting series and the cycle index
of the configuration group. The desired configuration counting series in the
present context is the function F_{G,H}(x) of equation (1).

Let L and L_1 be the line sets of G and H respectively. Let G - H be the
spanning subgraph of G whose line set is L - L_1. Then the figures are the
pairs of points of G adjacent in G - H. In a configuration K, that is, a graph
between G and H, the content of a given figure is 1 if the points of the pair
are adjacent in K and is 0 otherwise. Hence the figure counting series is

(3) f(x) = 1 + x.

The appropriate configuration group for this problem is the permutation
group which acts on L - L_1, described as follows. Consider the subgroup of
Γ_1(G), the line group of G, which leaves the set L_1 invariant. If in this
subgroup we cut down the object set from L to L - L_1, which can be done
since L_1 is invariant, we obtain the required configuration group. We denote
this group by Γ_1(G - H/H) to indicate that it is the line group of G - H
subject to the auxiliary condition that H is left invariant. On substituting
these observations into equation (2), we obtain the formula:

(4) F_{G,H}(x) = Z(Γ_1(G - H/H), 1 + x).


THEOREM. The generating function for the number of dissimilar graphs be- 
tween G and H is given by equation (4). 
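
For small graphs the coefficients a_k can be cross-checked by direct orbit counting. The following Python sketch is not the method of the paper: it assumes the points are labelled 0, ..., p - 1, the names move and dissimilar_between are hypothetical, and it simply lists every automorphism of G that leaves the line set of H invariant and counts the orbits of the intermediate line sets.

```python
from itertools import combinations, permutations


def move(perm, lines):
    """Image of a set of lines under a permutation of the points."""
    return frozenset(frozenset(perm[v] for v in line) for line in lines)


def dissimilar_between(p, g_lines, h_lines):
    """Return [a_0, ..., a_r] for the graphs K between G and H.

    Brute force over all p! point permutations, so only for small p.
    """
    g = frozenset(frozenset(e) for e in g_lines)
    h = frozenset(frozenset(e) for e in h_lines)
    free = sorted(g - h, key=sorted)  # the line set L - L_1
    # Automorphisms of G that leave the line set of H invariant.
    group = [perm for perm in permutations(range(p))
             if move(perm, g) == g and move(perm, h) == h]
    counts = []
    for k in range(len(free) + 1):
        seen, orbits = set(), 0
        for extra in combinations(free, k):
            key = frozenset(extra)
            if key not in seen:
                orbits += 1
                seen.update(move(perm, key) for perm in group)
        counts.append(orbits)
    return counts


# K_{3,3} on points 0..5 (even points joined to odd points) and a 6-cycle in it.
k33 = [(i, j) for i in (0, 2, 4) for j in (1, 3, 5)]
c6 = [(i, (i + 1) % 6) for i in range(6)]
print(dissimilar_between(6, k33, c6))  # [1, 1, 1, 1]
```

Taking H with no lines reduces the count to the dissimilar spanning subgraphs of G, the case treated in (2).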


For purposes of clarity, we illustrate this theorem using two examples
other than those in (2) and (3).


Example 1. Let G = K_{3,3}, and let H = C_6, a cycle of length 6. Then
Γ_1(G - H/H) = S_3, the symmetric group of degree 3. Since it is well known
that Z(S_n, 1 + x) = 1 + x + x² + ... + x^n, it follows that F_{G,H}(x) =
1 + x + x² + x³. This counting polynomial is illustrated in Figure 1, in
which the first graph is C_6 and the last is K_{3,3}.

FIGURE 1


Example 2. Let G = Q_3, the graph of the three dimensional cube, and let
H = C_8, any Hamiltonian circuit of Q_3. (A Hamiltonian or complete circuit
of a graph is one containing all its points. All complete circuits of Q_3 are
similar.) Then one readily sees that Γ_1(Q_3 - C_8/C_8) is the direct product of
two copies of the permutation group S_2, denoted S_2 × S_2 or more briefly
S_2². Since Z(S_2) = ½(f_1² + f_2) and Z(S × T) = Z(S)·Z(T) for any
permutation groups S and T, we have Z(S_2²) = ¼(f_1⁴ + 2f_1²f_2 + f_2²).

Hence

Z(Γ_1(Q_3 - C_8/C_8), 1 + x) = ¼[(1 + x)⁴ + 2(1 + x)²(1 + x²) + (1 + x²)²]

so that

F_{Q_3,C_8}(x) = 1 + 2x + 3x² + 2x³ + x⁴.

This generating function enumerates the graphs of Figure 2, in which C_8
appears first and Q_3 last.

FIGURE 2
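
The substitutions in Examples 1 and 2 are routine polynomial arithmetic, which the following Python sketch reproduces. The representation of a cycle index as a list of (coefficient, {k: exponent of f_k}) terms and the names poly_mul and substitute are illustrative assumptions, not notation from the paper; the cycle indices themselves are entered exactly as stated in the examples, with Z(S_3) = (f_1³ + 3f_1f_2 + 2f_3)/6.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def substitute(cycle_index):
    """Evaluate Z(group, 1 + x): replace each f_k by 1 + x^k."""
    total = [Fraction(0)]
    for coeff, monomial in cycle_index:
        term = [Fraction(coeff)]
        for k, exponent in monomial.items():
            f_k = [Fraction(1)] + [Fraction(0)] * (k - 1) + [Fraction(1)]  # 1 + x^k
            for _ in range(exponent):
                term = poly_mul(term, f_k)
        width = max(len(total), len(term))
        total = [(total[i] if i < len(total) else 0) + (term[i] if i < len(term) else 0)
                 for i in range(width)]
    return [int(c) if c.denominator == 1 else c for c in total]


# Z(S_3) = (f_1^3 + 3 f_1 f_2 + 2 f_3) / 6, as used in Example 1.
z_s3 = [(Fraction(1, 6), {1: 3}), (Fraction(1, 2), {1: 1, 2: 1}), (Fraction(1, 3), {3: 1})]
# Z(S_2 x S_2) = (f_1^4 + 2 f_1^2 f_2 + f_2^2) / 4, as stated in Example 2.
z_s2_squared = [(Fraction(1, 4), {1: 4}), (Fraction(1, 2), {1: 2, 2: 1}), (Fraction(1, 4), {2: 2})]

print(substitute(z_s3))          # [1, 1, 1, 1]
print(substitute(z_s2_squared))  # [1, 2, 3, 2, 1]
```

The sketch checks only the arithmetic of the substitution; identifying the configuration group is a separate step.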


These considerations leave open the subtle problem of enumerating the 
dissimilar Hamiltonian circuits in an n-cube. 










REFERENCES 


1. F. Harary, The number of linear, directed, rooted and connected graphs, Trans. Amer. Math. Soc., 78 (1955), 445-463.
2. F. Harary, On the number of dissimilar line-subgraphs of a given graph, Pacific J. Math., 6 (1956), 57-64.
3. F. Harary, The number of dissimilar supergraphs of a linear graph, Pacific J. Math., 7 (1957), 903-911.
4. G. Pólya, Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen, Acta Math., 68 (1937), 145-254.


University of Michigan 
and 
The Institute for Advanced Study 

















COVERINGS OF BIPARTITE GRAPHS 
A. L. DULMAGE AND N. S. MENDELSOHN


1. Introduction and summary. For the purpose of analysing bipartite 
graphs (hereinafter called simply graphs) the concept of an exterior covering 
is introduced. In terms of this concept it is possible in a natural way to decom- 
pose any graph into two parts, an inadmissible part and a core. It is also 
possible to decompose the core into irreducible parts and thus obtain a canoni- 
cal reduction of the graph. The concept of irreducibility is very easily and 
naturally expressed in terms of exterior coverings. The role of the inadmissible 
edges of a graph is to obstruct certain natural coverings of the graph. 

Ore (7) has studied graphs using the notion of a set of maximal deficiency. 
For finite graphs a set of maximal deficiency in Ore’s sense becomes the 
complement of a first member of a minimal exterior pair as defined by us. 
Because of this, a number of theorems obtained by us become equivalent to 
theorems of Ore when the graph is finite. For infinite graphs the situation is 
quite different since Ore’s finiteness condition and ours can never be satisfied 
simultaneously. 

Amongst the theorems obtained are generalizations of results due to König
(5) which may be interpreted as theorems in distinct representatives of sets. 
In the sixth section inequalities are obtained connecting the dimension of a 
graph with certain simple parameters obtained from a matrix representation. 
These results are continuations of those obtained by the authors in (2). Results 
of this type are of importance from the computational aspect and are connected 
with the theory of games through the optimal assignment problem as shown 
in von Neumann (9) and Dulmage and Halperin (4). 


2. Notation. Throughout this paper the following notation is used: S and
T represent two arbitrary sets, and S × T their Cartesian product consisting
of pairs (s, t) with s ∈ S, t ∈ T. Any subset K of S × T is called a graph, and
its elements (s, t) are called edges. A, A_i, S_i, A*, A_* are subsets of S and
B, B_i, T_i, B*, B_* are subsets of T. A_i ∪ A_j, A_i ∩ A_j and Ā_i represent union,
intersection, and complement (with respect to S). The null set is denoted by
φ. If a set A_i contains a finite number n of elements, n is called the order of
A_i and this is denoted by ν(A_i) = n; otherwise, ν(A_i) = ∞. If both S and
T have a finite or countable number of elements ordered as s_1, s_2, s_3, ...
and t_1, t_2, t_3, ..., and K is any graph, a standard matrix representation for
K is defined as follows: the entry a_ij = 1 if (s_i, t_j) is an edge of K, otherwise
a_ij = 0. It will also be convenient to represent K by a more general matrix
representation in which entries are non-negative real numbers having the
property a_ij > 0 if (s_i, t_j) ∈ K, otherwise a_ij = 0.

Received January 20, 1958.


3. The covering theorems. Let K be any graph. A pair of sets [A, B]
is an exterior cover (or simply cover) for K if for each (s, t) ∈ K, s ∈ A or
t ∈ B (or both). Otherwise stated, [A, B] is an exterior cover for K if

K ⊂ (A × B) ∪ (A × B̄) ∪ (Ā × B) = (A × T) ∪ (S × B).

Thus K ∩ (Ā × B̄) = φ if and only if [A, B] covers K. The number ν(A)
+ ν(B) is called the dimension of the covering and K is said to be of finite
exterior dimension if there is a covering [A, B] such that ν(A) + ν(B) is
finite; otherwise K is of infinite exterior dimension. A graph K consisting of
an infinite number of edges may be of finite exterior dimension.

The exterior dimension E(K) of K is defined as E(K) = min(ν(A) + ν(B)),
the minimum being taken over all exterior pairs [A, B] which cover K. An
exterior pair [A, B] for which the minimum E(K) is achieved is called a
minimal exterior pair, abbreviated m.e.p.
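
For finite S and T these definitions can be checked by exhaustive search. The following Python sketch is only an illustration: edges are taken to be pairs (s, t) of integers, and the names subsets, is_cover and minimal_exterior_pairs are hypothetical. The search is exponential in the sizes of S and T and is meant for very small examples.

```python
from itertools import combinations


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def is_cover(K, A, B):
    """[A, B] is an exterior cover if every edge (s, t) has s in A or t in B."""
    return all(s in A or t in B for s, t in K)


def minimal_exterior_pairs(K, S, T):
    """Return E(K) together with the list of all m.e.p.'s for K."""
    covers = [(A, B) for A in subsets(S) for B in subsets(T) if is_cover(K, A, B)]
    E = min(len(A) + len(B) for A, B in covers)
    return E, [(A, B) for A, B in covers if len(A) + len(B) == E]


# A graph formed by one full row and one full column of a 3 by 3 array.
S = T = range(3)
K = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)}
E, meps = minimal_exterior_pairs(K, S, T)
print(E, meps)
```

Here the only m.e.p. is [{0}, {0}], so E(K) = 2 although K has five edges.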

Another concept of importance is that of a disjoint graph K*. The graph
K* is said to be disjoint if for every two distinct edges (s_1, t_1), (s_2, t_2) of K*,
s_1 ≠ s_2 and t_1 ≠ t_2. It is obvious that a disjoint graph K* of finite exterior
dimension E(K*) contains exactly E(K*) edges, and conversely.

THEOREM 1. If K is a graph of infinite exterior dimension, then K contains
an infinite disjoint subgraph K*.

Proof. It is sufficient to show that to any disjoint subgraph K* of exterior
dimension n, there is at least one edge in K which, when added to K*, yields a
disjoint subgraph of exterior dimension n + 1. Let (s_1, t_1), (s_2, t_2), ...,
(s_n, t_n) be the edges of K*. Let A = {s_1, s_2, ..., s_n} and B = {t_1, t_2, ..., t_n}.
Since K is not of finite exterior dimension [A, B] does not cover K. Hence,
K contains an edge (s_{n+1}, t_{n+1}) which is contained in Ā × B̄. This edge when
added to K* yields the required subgraph.


THEOREM 2. If K is a graph of finite exterior dimension then K contains a 
disjoint subgraph K* such that E(K) = E(K*). 

Proof. The proof is by induction on E(K). If E(K) = 1, then K is non-null 
and any edge of K may be used as K*. 


To establish the theorem for any E(K), we distinguish two cases.

In the first case, there exists an m.e.p. [A, B] such that neither A nor B
is null. Let ν(A) = u, ν(B) = v so that u + v = E(K). Put K_1 =
K ∩ (A × B̄). [A, φ] is an exterior pair for K_1 of dimension u. The pair
[A, φ] is an m.e.p. for, if not, K_1 may be covered by a pair

[A_{K_1}, B_{K_1}], with ν(A_{K_1}) + ν(B_{K_1}) = ρ < u.

But then

[A_{K_1}, B ∪ B_{K_1}]

covers K, and its dimension is ρ + v < E(K) which contradicts the fact that
[A, B] is an m.e.p. for K. By the induction hypothesis since u < E(K) there
exists a disjoint graph K*_1 consisting of u edges of K_1 such that E(K*_1) = E(K_1) =
u. Similarly, if K_2 = K ∩ (Ā × B), there exists a disjoint graph K*_2 consisting
of v edges of K_2 such that E(K*_2) = E(K_2) = v. Putting K* = K*_1 ∪ K*_2
it follows that K* is a disjoint subgraph of K such that E(K*) = E(K).

In the second case, if [A, B] is an m.e.p. then A = φ or B = φ. Suppose,
for definiteness, that B = φ. Then K ⊂ A × T. Let (s, t) be any edge of
K and let L = K ∩ ((A - s) × (T - t)). Since [A, φ] is an m.e.p. for
K, E(K) = ν(A). Since [A - s, φ] covers L,

E(L) ≤ ν(A - s) + ν(φ) = E(K) - 1.

If E(L) = E(K) - 1, then by the inductive assumption there exists a disjoint
subgraph L* of L such that E(L*) = E(L) = E(K) - 1. Putting K* =
L* ∪ (s, t) it follows that K* is a disjoint subgraph of K such that E(K*) =
E(K). If E(L) < E(K) - 1, let [A_L, B_L] be an m.e.p. for L. The pair
[A_L ∪ s, B_L ∪ t] is a covering for K of dimension ≤ E(K). Thus K has an
m.e.p. in which neither set is null, which contradicts the hypothesis of the
second case.

The concept of a disjoint graph has been given two interpretations in the 
literature in connection with the matrix representations of a graph. The authors 
have in (2) introduced the notion of a sub-permutation set of places in a 
matrix as a set of places which contains at most one place in any row or 
column of the matrix. It is clear that in any matrix representation of a disjoint 
graph the non-zero entries occupy a sub-permutation set of places. Ore (7) 
has defined the term rank ρ of a matrix A to be the order of the greatest
minor in A with a non-zero term in its determinant expansion. Theorem 2, 
then, states that E(K) is equal to the term rank of any matrix representation 
of K. 
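
For a finite graph the equality E(K) = E(K*) of Theorem 2 can be observed numerically. The following Python sketch builds a largest disjoint subgraph by the standard augmenting-path method for bipartite matching, which is a swapped-in technique and not the inductive argument of the proof, and compares its size with E(K) found by exhaustive search. The function names and the sample graph are illustrative assumptions.

```python
from itertools import combinations


def max_disjoint_subgraph(K):
    """A largest disjoint subgraph of K, found by augmenting paths."""
    neighbours = {}
    for s, t in sorted(K):
        neighbours.setdefault(s, []).append(t)
    partner = {}  # partner[t] = s for each chosen edge (s, t)

    def augment(s, visited):
        for t in neighbours[s]:
            if t not in visited:
                visited.add(t)
                if t not in partner or augment(partner[t], visited):
                    partner[t] = s
                    return True
        return False

    for s in neighbours:
        augment(s, set())
    return {(s, t) for t, s in partner.items()}


def exterior_dimension(K, S, T):
    """E(K) by exhaustive search over all pairs [A, B]; small sets only."""
    S, T = list(S), list(T)
    best = len(S)  # [S, null set] always covers K
    for i in range(len(S) + 1):
        for A in combinations(S, i):
            for j in range(len(T) + 1):
                for B in combinations(T, j):
                    if i + j < best and all(s in A or t in B for s, t in K):
                        best = i + j
    return best


S, T = range(3), range(4)
K = {(0, 0), (0, 1), (1, 1), (2, 1), (2, 3)}
K_star = max_disjoint_subgraph(K)
print(len(K_star), exterior_dimension(K, S, T))  # 3 3
```

In matrix terms, the size of K_star is the term rank of the standard matrix representation of K.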

An edge of a graph K is said to be inadmissible if it is not an edge of any 
disjoint subgraph K* such that E(K*) = E(K); otherwise the edge is ad- 
missible. By the proof of Theorem 1 a graph K of infinite exterior dimension 
does not have inadmissible edges. It is clear that removing any or all
inadmissible edges from a graph leaves a new graph with the same admissible
subset of edges. If K is any graph, the subset K, consisting of all admissible 
edges of K is called the core of K. 


THEOREM 3. An edge of K is inadmissible if and only if it is in the union of 
all the sets A X B such that [A, B] is an m.e.p. for K. 


Proof. Let [A, B] be an m.e.p. for K and let (s_1, t_1) ∈ K be an element of
A × B. Let E(K) = q and let K' be any subgraph of K containing (s_1, t_1) and
having exactly q edges. Explicitly let K' consist of the edges (s_1, t_1), (s_2, t_2), ...,
(s_q, t_q). Let ν(A) = u, ν(B) = v, where u + v = q. Since [A, B] is an exterior
cover for K', either s_i ∈ A or t_i ∈ B for i = 2, 3, ..., q. Also s_1 ∈ A and
t_1 ∈ B. Thus there are at least q + 1 elements belonging to A or B. Since
the totality of elements belonging to A or B is q, either two of s_1, s_2, ..., s_q
or two of t_1, t_2, ..., t_q are equal. If, for definiteness, two of s_1, s_2, ..., s_q are
equal and A' = {s_1, s_2, ..., s_q}, ν(A') ≤ q - 1 and [A', φ] covers K'. Hence
E(K') < q so that (s_1, t_1) is inadmissible.

Conversely, if (s, t) ∈ K does not belong to A × B for any m.e.p. [A, B]
for K, it will be shown that (s, t) lies in a disjoint subgraph K* for which
E(K*) = E(K). Let L = K ∩ ((S - s) × (T - t)). Clearly E(L) < E(K).
If E(L) = E(K) - 1, let L* be a disjoint subgraph of L such that E(L*) =
E(L). Then K* = L* ∪ (s, t) is a disjoint subgraph of K, such that E(K*)
= E(K). If E(L) ≤ E(K) - 2, let [A_L, B_L] be an m.e.p. for L. Then
[A_L ∪ s, B_L ∪ t] is an exterior cover for K of dimension ≤ E(K). Hence
[A_L ∪ s, B_L ∪ t] is an m.e.p. for K, and (s, t) ∈ (A_L ∪ s) × (B_L ∪ t).
This gives the required contradiction.
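
Theorem 3 can be confirmed on small finite graphs by computing the inadmissible edges in two ways: from the definition, and as the edges lying in A × B for some m.e.p. [A, B]. The following Python sketch does both by exhaustive search; the function names and the sample graph are assumptions made for illustration only.

```python
from itertools import combinations


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def is_disjoint(edges):
    """No two edges share a first or a second co-ordinate."""
    firsts = [s for s, _ in edges]
    seconds = [t for _, t in edges]
    return len(set(firsts)) == len(firsts) and len(set(seconds)) == len(seconds)


def inadmissible_by_definition(K):
    """Edges of K lying in no disjoint subgraph K* with E(K*) = E(K)."""
    disjoint = [m for m in subsets(K) if is_disjoint(m)]
    best = max(len(m) for m in disjoint)
    admissible = set().union(*(m for m in disjoint if len(m) == best))
    return set(K) - admissible


def inadmissible_by_theorem_3(K, S, T):
    """Edges of K lying in A x B for some m.e.p. [A, B]."""
    covers = [(A, B) for A in subsets(S) for B in subsets(T)
              if all(s in A or t in B for s, t in K)]
    E = min(len(A) + len(B) for A, B in covers)
    meps = [(A, B) for A, B in covers if len(A) + len(B) == E]
    return {(s, t) for s, t in K for A, B in meps if s in A and t in B}


S = T = range(2)
K = {(0, 0), (0, 1), (1, 1)}
print(inadmissible_by_definition(K), inadmissible_by_theorem_3(K, S, T))
```

For this K the m.e.p. [{0}, {1}] contains the edge (0, 1) in A × B, and (0, 1) is the only inadmissible edge.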

Complementary to the concept of an exterior pair for a graph K is that of
an interior pair for which the following is a definition. A pair {A, B} where A
and B are non-null subsets of S and T respectively is said to be an interior
pair for a graph K if (A × B) ⊂ K. From the definition it follows that if
[A, B] is an exterior cover for K such that A ≠ S and B ≠ T then {Ā, B̄}
is an interior pair for K̄ (the complement of K in S × T). Conversely, if {A, B}
is an interior pair for K̄, then [Ā, B̄] is an exterior cover for K. For any graph
K an interior dimension I(K) is defined by I(K) = max(ν(A) + ν(B))
where the maximum is taken over all interior pairs {A, B} for K. A pair
{A, B} for which the maximum value I(K) is achieved is called a maximal
interior pair. The number ν(A) + ν(B) is called the dimension of the pair
{A, B} regardless of whether {A, B} is a maximal interior pair. If a graph K
has interior pairs {A, B} of arbitrarily large dimension we put I(K) = ∞.
Note that there is no necessary connection between the magnitudes of the
dimensions I(K) and E(K). Either may be greater than, equal to, or less than
the other and either may be infinite while the other remains finite. There is a
duality theorem connecting the exterior dimension of a graph with the interior
dimension of its complement (provided both are finite) which is now given.


THEOREM 4. Let ν(S) = p, ν(T) = q, p ≤ q. If K is a graph for which
E(K) < p, then E(K) + I(K̄) = p + q. If E(K) = p, then E(K) + I(K̄)
≤ p + q, the equality sign holding if and only if K has an m.e.p. [A, B] for
which A ≠ S and B ≠ T.

Proof. Suppose E(K) < p and let [A, B] be an m.e.p. for K. Then A ≠ S
and B ≠ T so that {Ā, B̄} is an interior pair for K̄. Hence I(K̄) ≥ ν(Ā) +
ν(B̄) = p - ν(A) + q - ν(B) = p + q - E(K). Hence I(K̄) + E(K) ≥ p +
q. Now let {A_1, B_1} be a maximal interior pair for K̄. Then I(K̄) = ν(A_1) +
ν(B_1). Furthermore [Ā_1, B̄_1] is an exterior cover for K so that

E(K) ≤ ν(Ā_1) + ν(B̄_1) = p - ν(A_1) + q - ν(B_1) = p + q - I(K̄).

Hence E(K) + I(K̄) ≤ p + q. This together with the previous inequality
yields E(K) + I(K̄) = p + q.

If now E(K) = p and there is an m.e.p. [A, B] such that A ≠ S, B ≠ T
then the above proof is valid and E(K) + I(K̄) = p + q. On the other
hand, if [A, B] is an m.e.p. for K implies A = S or B = T, then either K̄
has no interior pair, in which case E(K) + I(K̄) = p < p + q, or for any maximal
interior pair {A_1, B_1} for K̄, [Ā_1, B̄_1] is an exterior cover for K with Ā_1 ≠ S and
B̄_1 ≠ T. Hence, [Ā_1, B̄_1] is not an m.e.p. for K, which implies

E(K) < ν(Ā_1) + ν(B̄_1) = p - ν(A_1) + q - ν(B_1) = p + q - I(K̄).

Hence E(K) + I(K̄) < p + q.

Theorem 4 has the following interpretation for matrices. Let M be a p
by q matrix of term rank ρ. If ρ < p ≤ q then M contains a u by v block of
zeros and ρ + u + v = p + q. If ρ = p ≤ q, then for any block of zeros of
size u by v in M, ρ + u + v ≤ p + q. In §4 we shall return to the matrix
interpretation of the graphical theorems.
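
The duality of Theorem 4 is easy to test on small finite examples. The following Python sketch computes E(K) and the interior dimension of the complement by exhaustive search. The function names are hypothetical, and the interior dimension of a graph with no interior pair is taken to be 0, which is an assumption of the sketch.

```python
from itertools import combinations


def subsets(items):
    """All subsets of a finite collection, as sets."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield set(combo)


def exterior_dimension(K, S, T):
    """E(K): least dimension of a pair [A, B] covering K."""
    return min(len(A) + len(B)
               for A in subsets(S) for B in subsets(T)
               if all(s in A or t in B for s, t in K))


def interior_dimension(K, S, T):
    """I(K): greatest dimension of an interior pair; 0 if none (assumption)."""
    dims = [len(A) + len(B)
            for A in subsets(S) for B in subsets(T)
            if A and B and all((s, t) in K for s in A for t in B)]
    return max(dims, default=0)


def complement(K, S, T):
    """The complement of K in S x T."""
    return {(s, t) for s in S for t in T} - set(K)


# p = q = 3 and E(K) = 2 < p, so the first part of the theorem applies.
S = T = range(3)
K = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)}
E = exterior_dimension(K, S, T)
I_bar = interior_dimension(complement(K, S, T), S, T)
print(E, I_bar, E + I_bar)  # 2 4 6
```

In the matrix reading, the complement of this K contains the 2 by 2 block of zeros of the standard matrix representation, and 2 + 2 + 2 = p + q.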


4. The canonical decomposition of graphs. A graph K is said to be
irreducible if for every m.e.p. [A, B] for K, either A = φ or B = φ; otherwise
K is reducible. It is clear that an irreducible graph has no inadmissible edges.
In this section the decomposition of reducible graphs of finite exterior
dimension is considered.


THEOREM 5. If [A_1, B_1] and [A_2, B_2] are m.e.p.'s for a graph K of finite
exterior dimension then [A_1 ∩ A_2, B_1 ∪ B_2] and [A_1 ∪ A_2, B_1 ∩ B_2] are both
m.e.p.'s for K.

Proof. Let (s, t) be any edge of K. Then s ∈ A_1 or t ∈ B_1 and s ∈ A_2
or t ∈ B_2. If s ∉ (A_1 ∪ A_2) then s ∉ A_1 and s ∉ A_2 so that t ∈ B_1 and
t ∈ B_2. Hence [A_1 ∪ A_2, B_1 ∩ B_2] is an exterior cover for K. Similarly,
[A_1 ∩ A_2, B_1 ∪ B_2] is an exterior cover for K. Now

E(K) = ν(A_1) + ν(B_1) = ν(A_2) + ν(B_2).

Since [A_1 ∩ A_2, B_1 ∪ B_2] covers K,

E(K) ≤ ν(A_1 ∩ A_2) + ν(B_1 ∪ B_2)
     = ν(A_1 ∩ A_2) + ν(B_1) + ν(B_2) - ν(B_1 ∩ B_2).

Since [A_1 ∪ A_2, B_1 ∩ B_2] covers K, it follows that

E(K) ≤ ν(A_1 ∪ A_2) + ν(B_1 ∩ B_2) = ν(A_1) + ν(A_2) - ν(A_1 ∩ A_2) + ν(B_1 ∩ B_2).

Both equalities must hold for, if not, we have

2E(K) < ν(A_1) + ν(B_1) + ν(A_2) + ν(B_2) = 2E(K),

a contradiction. Thus [A_1 ∩ A_2, B_1 ∪ B_2] and [A_1 ∪ A_2, B_1 ∩ B_2] are m.e.p.'s
for K.

THEOREM 6. If [A_1, B_1] and [A_2, B_2] are m.e.p.'s for K and if A_1 ⊂ A_2
then B_1 ⊃ B_2.

Proof. Suppose A_1 ⊂ A_2 and b_2 ∈ B_2 but b_2 ∉ B_1. There must exist an
edge (a_2, b_2) of K such that a_2 ∉ A_2; for if there is no such element then
[A_2, B_2 - b_2] is an exterior cover of smaller dimension than that of [A_2, B_2].
Since a_2 ∉ A_2 and A_1 ⊂ A_2, then a_2 ∉ A_1. Since b_2 ∉ B_1 the pair [A_1, B_1]
does not cover the edge (a_2, b_2), a contradiction.

COROLLARY. If [A_1, B_1] and [A_2, B_2] are m.e.p.'s for K and if A_1 is a proper
subset of A_2 or if A_1 = φ, then B_2 is a proper subset of B_1 or B_2 = φ. Also if
[A, B_1] and [A, B_2] are m.e.p.'s for K, then B_1 = B_2.


THEOREM 7. For any graph K of finite exterior dimension there exist uniquely
determined m.e.p.'s [A_*, B*] and [A*, B_*] such that if [A, B] is any other m.e.p.,
then

(i) A_* is a proper subset of A or A_* = φ,
(ii) A is a proper subset of A*,
(iii) B_* is a proper subset of B or B_* = φ,
(iv) B is a proper subset of B*.

Proof. If K has an m.e.p. [A_*, B*] where A_* = φ, then (i) and (iv) hold
for any m.e.p. [A, B] and [A_*, B*] is the unique m.e.p. for K with this property.
If there is no m.e.p. for K whose first member is null, let [A_*, B*] be an
m.e.p. for K for which A_* contains the smallest number of elements of all
first members of m.e.p.'s for K. A_* is uniquely determined, since if A_0 is the
first member of an m.e.p. for K which contains the same number of elements
as does A_*, then by Theorem 5, A_0 ∩ A_* is the first member of an m.e.p. for
K. Hence if A_0 ≠ A_*, the set A_0 ∩ A_* would have fewer members than A_*,
a contradiction. Since A_* is uniquely determined, the corollary to Theorem 6
shows that B* is also uniquely determined. Let [A, B] be any other m.e.p.
for K. [A_* ∩ A, B* ∪ B] is an m.e.p. so that A_* ∩ A = A_*. This implies
B* ∪ B = B*. Hence A_* ⊂ A and B ⊂ B*. Both these inclusions are
proper, otherwise A_* = A and B* = B which contradicts the assumption
that [A, B] is different from [A_*, B*]. Similarly, there is an m.e.p. [A*, B_*]
for which (ii) and (iii) hold.

From the above proof it is seen that the sets A_*, A*, B_*, B* are definable
as follows: A_* = ∩A, A* = ∪A, B_* = ∩B, B* = ∪B where A ranges
over all first members and B ranges over all second members of m.e.p.'s for
K. The pairs [A_*, B*] and [A*, B_*] will be referred to as the extreme m.e.p.'s
for K.
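
On a finite example the closure property of Theorem 5 and the description of the extreme m.e.p.'s as intersections and unions can be verified directly. The following Python sketch lists all m.e.p.'s by exhaustive search; the function names and the sample graph are illustrative assumptions.

```python
from itertools import combinations


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def minimal_exterior_pairs(K, S, T):
    """The set of all m.e.p.'s [A, B] for K, found by exhaustive search."""
    covers = [(A, B) for A in subsets(S) for B in subsets(T)
              if all(s in A or t in B for s, t in K)]
    E = min(len(A) + len(B) for A, B in covers)
    return {(A, B) for A, B in covers if len(A) + len(B) == E}


def extreme_pairs(meps):
    """Return ([A_*, B*], [A*, B_*]) as intersections and unions over all m.e.p.'s."""
    firsts = [A for A, _ in meps]
    seconds = [B for _, B in meps]
    lower = (frozenset.intersection(*firsts), frozenset.union(*seconds))
    upper = (frozenset.union(*firsts), frozenset.intersection(*seconds))
    return lower, upper


S = T = range(2)
K = {(0, 0), (0, 1), (1, 1)}
meps = minimal_exterior_pairs(K, S, T)
lower, upper = extreme_pairs(meps)

# Theorem 5: the m.e.p.'s are closed under both combinations.
closed = all((A1 & A2, B1 | B2) in meps and (A1 | A2, B1 & B2) in meps
             for A1, B1 in meps for A2, B2 in meps)
print(len(meps), closed, lower in meps, upper in meps)  # 3 True True True
```

For this K the three m.e.p.'s are [φ, T], [{0}, {1}] and [S, φ], so K is reducible and the extreme pairs are [φ, T] and [S, φ].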

Let [Ax«, B*] and [A*, By] be the extreme m.e.p.’s for a reducible graph K 
of finite exterior dimension E(K). If A* = Ax the Cartesian product S X T 
is divided into three parts R; = (A+ X Bt) U (Ag & B*), Ro = Ax X B*, 
and R; = As X B*. On the other hand, if »(A*) — v(Ax) > 0, there is at 
least one non-null set A such that A (\ Ay = ¢, A U Ax is the first member 





~— ee ee 








COVERINGS OF BIPARTITE GRAPHS 523 


of an m.e.p. for K and such that »(A) is minimal. Let u,; = »(A) and let S, 
be a particular set (possibly unique) amongst all such A. Put A; = As U Sj. 
Let B, be the uniquely determined second member such that [A,, B,] is an 
m.e.p. for K and let 7, = B* — B,;. Now 


v(7,) = v(B*) — »(B,) = {E(K) — v(As)} — [E(K) — o(A))} 
= v(A)) a v(A¢) = v(S;) = 4}. 


Further, S; and 7; are constructed inductively as follows. Provided 
y(A*) — vo(Ag US, U So,...,\.U Sis) > O there exists at least one non- 
null set A such that »(A) is minimal, A (\ (Ag US,...\U Sy) = @ and 
AyUSiU S:...U Si1 UA is the first member of an m.e.p. for K. Let 
S, be any particular set which satisfies these requirements on A and put 
v(S,) = uy. PutA, = Ae US,US2...U Sis U S,and let B, be the unique- 
ly determined set such that [A ,, B,] is an m.e.p. for K. Let T, = By, — By. 
As before, »(T,) = v(S,) = uy. 

The process stops when Ay US; U S:....U S = A*. Thus S = Ay US 
US:...US,U A*. This decomposition of S into k + 2 disjoint subsets 
is the canonical decomposition with respect to the reducible graph K of finite 
exterior dimension. T = B\U 7, T:...U T, U By is the canonical de- 
composition of 7. We have: 


S,f\ Ae = @ for: = 1,2,...,k; 
SiN S; = ¢@ for all i, j,i # j; 
T.0\ Be = fori = 1,2,...,k; 


T,1\T; = @ for all i, j,i ¥ j; 
v(S;) = v(T;) = Ui; 

k 
E(K) = v(Aw) + (Be) + DO uy; 


i=l 


[A ,, B,] isan m.e.p. for K,i = 1, 2,...k, 
where A, = Asx U Si U ee S; 
and By = TuiVU Tus..-U 7, U Be. 


= 
] 


= (As X Bt) U (S; X Ti) U (S2 X Tx)... U (Sz X Tx) U (A* X Be); 
(de X B*) U (A* X Be) U (Se X T)); 
i<j 


R; = (As X B*) U (A* X Be) U (5: X T;): 
>j 


2 
] 


Ri, Ro, Rs are disjoint and R;\U RU R; = S X T. 

In the following figure, this decomposition is shown in the case where it is
assumed that the elements of $S$ are ordered so that the points of $A_*$ come first,
followed by those of $S_1, S_2, S_3, \dots, S_k$ and finally $\overline{A^*}$, while those of $T$ are
ordered $B_*, T_k, T_{k-1}, \dots, T_1, \overline{B^*}$. In this representation $R_2$ appears in the
upper left corner of the diagram, $R_3$ in the lower right corner, and $R_1$ separates
$R_2$ from $R_3$.














FIGURE 1


This decomposition of the Cartesian product $S \times T$ into $R_1$, $R_2$, and $R_3$
is the canonical decomposition of $S \times T$ with respect to the reducible graph
$K$ of finite exterior dimension.


THEOREM 8. If $R_1$, $R_2$ and $R_3$ form the canonical decomposition of $S \times T$
with respect to a reducible graph $K$ of finite exterior dimension, then (i) every
element of $K \cap R_2$ is inadmissible and (ii) $K \cap R_3 = \phi$.


Proof. Part (i) is implied by Theorem 3 for the following reasons. First,
$[A_*, B^*]$ and $[A^*, B_*]$ are m.e.p.'s for $K$. Secondly, $[S_i, T_j]$ is an exterior cover
for $S_i \times T_j$ and hence, if $i < j$, $[A_i, B_i]$ is an m.e.p. for $K$ such that
$(S_i \times T_j) \cap K \subset (A_i \times B_i)$.

To prove part (ii) we note that clearly no element of $K$ is in $(\overline{A_*} \times \overline{B^*})$
or in $(\overline{A^*} \times \overline{B_*})$. Moreover $[A_{i-1}, B_{i-1}]$, which is an m.e.p. for $K$, does not
cover any edge of $S_i \times T_j$ when $i > j$.


COROLLARY. (1) Corresponding to any edge $(s, t)$ of $R_2$, there exists at least
one m.e.p. $[A, B]$ for $K$ such that $(s, t)$ is in $A \times B$.

(2) Corresponding to any edge $(s, t)$ of $R_3$, there exists at least one m.e.p.
$[A, B]$ for $K$ such that $(s, t)$ is in $\overline{A} \times \overline{B}$.

(3) In both (1) and (2) the m.e.p. may be chosen from among the $k + 1$
m.e.p.'s $[A_*, B^*], [A_1, B_1], [A_2, B_2], \dots, [A_{k-1}, B_{k-1}], [A^*, B_*]$ for $K$.


$K$ intersects $R_1$ in $k + 2$ disjoint irreducible subgraphs as indicated in the
following theorem.


THEOREM 9. If $[A_*, B^*]$ and $[A^*, B_*]$ are the extreme m.e.p.'s for a reducible
graph $K$ of finite exterior dimension, and if $S = A_* \cup S_1 \cup S_2 \cup \dots \cup S_k \cup \overline{A^*}$
and $T = \overline{B^*} \cup T_1 \cup T_2 \cup \dots \cup T_k \cup B_*$ are the canonical decompositions of
$S$ and $T$, then (1) the subgraphs $K \cap (A_* \times \overline{B^*})$ and $K \cap (\overline{A^*} \times B_*)$ are
irreducible and their only m.e.p.'s are $[A_*, \phi]$ and $[\phi, B_*]$ respectively, and
(2) the subgraphs $K \cap (S_i \times T_i)$ are irreducible for $i = 1, 2, 3, \dots, k$.


Proof. If there exists an m.e.p. $[A', B']$ for $K \cap (A_* \times \overline{B^*})$ such that $A'$
is a proper subset of $A_*$, we have $\nu(A') + \nu(B') \le \nu(A_*)$. Since $K \cap R_3$
is null, $[A', B' \cup B^*]$ is an exterior pair for $K$ and since its dimension is

$$\nu(A') + \nu(B') + \nu(B^*) \le \nu(A_*) + \nu(B^*) = E(K)$$

it is a minimal pair. Since $A'$ is a proper subset of $A_*$, this contradicts the
fact that $[A_*, B^*]$ is an extreme m.e.p. for $K$. Thus $K \cap (A_* \times \overline{B^*})$ is
irreducible and its only m.e.p. is $[A_*, \phi]$. Similarly $K \cap (\overline{A^*} \times B_*)$ is irreducible
and its only m.e.p. is $[\phi, B_*]$.

If there exists an m.e.p. $[A', B']$ for $K \cap (S_i \times T_i)$ such that neither $A'$
nor $B'$ is null, then $\nu(A') + \nu(B') \le \nu(S_i) = \nu(T_i)$. Since $[A_{i-1}, B_i]$ is an
exterior cover for $K \cap R_2$, and since $K \cap R_3$ is null, the pair $[A, B] =
[A_{i-1} \cup A', B_i \cup B']$ is an exterior cover for $K$. This cover is minimal, since
its dimension is

$$\nu(A_{i-1}) + \nu(A') + \nu(B') + \nu(B_i) \le \nu(A_{i-1}) + \nu(S_i) + \nu(B_i) = E(K).$$

Since $\nu(A_{i-1}) < \nu(A) = \nu(A_{i-1}) + \nu(A') < \nu(A_{i-1}) + \nu(S_i) = \nu(A_i)$, this
contradicts the minimality assumption in the definition of $S_i$. Thus
$K \cap (S_i \times T_i)$ is irreducible for $i = 1, 2, \dots, k$.


THEOREM 10. If $K$ is a reducible graph of finite exterior dimension with a
corresponding canonical decomposition of $S$ and $T$, and if $\alpha$ denotes the collection
of $k + 1$ m.e.p.'s $[A_*, B^*], [A_1, B_1], [A_2, B_2], [A_3, B_3], \dots, [A_{k-1}, B_{k-1}],
[A^*, B_*]$ for $K$, and if $\beta$ denotes the collection of all m.e.p.'s $[A, B]$ for $K$, and if
$\gamma$ denotes the collection of $2^k$ pairs $[A, B]$ defined by

$$A = \Bigl(\bigcup_{i \in \Lambda} S_i\Bigr) \cup A_*, \qquad B = \Bigl(\bigcup_{i \in \Pi} T_i\Bigr) \cup B_*$$

in which $\Lambda$ and $\Pi$ are complementary subsets of $1, 2, 3, \dots, k$, then

(i) $\alpha \subset \beta \subset \gamma$,

(ii) the admissible subset of $K$ is $K_a = K \cap R_1$,

(iii) the inadmissible subset of $K$ is $K_i = K \cap R_2$, and

(iv) an exterior pair $[A, B]$ is an m.e.p. for $K_a$ if and only if $[A, B] \in \gamma$.

Proof. Let $K_1 = K \cap R_1$. We show, first, that $[A, B]$ is an m.e.p. for
$K_1$ if and only if $[A, B]$ belongs to $\gamma$. By Theorem 8, no element of $K - K_1$
is admissible and hence by Theorem 2, $E(K_1) = E(K)$. If $[A, B]$ belongs to
$\gamma$ it covers $K_1$, and

$$\nu(A) + \nu(B) = \nu(A_*) + \nu(B_*) + \sum_{i=1}^{k} u_i = E(K) = E(K_1)$$

so that any $[A, B] \in \gamma$ is an m.e.p. for $K_1$.

If $[A, B]$ is any m.e.p. for $K_1$ then, to show that it belongs to $\gamma$ it is sufficient
to show that $A_* \subset A \subset A^*$, $B_* \subset B \subset B^*$, and that, for $i = 1, 2, \dots, k$,
either $A \cap S_i = S_i$ and $B \cap T_i = \phi$ or $A \cap S_i = \phi$ and $B \cap T_i = T_i$.
Since $[A, B]$ and $[A_*, B^*]$ are m.e.p.'s for $K_1$, by Theorem 5, $[A \cup A_*, B \cap B^*]$
is an m.e.p. for $K_1$ and its dimension is

$$\nu(A) + \nu(A_*) - \nu(A \cap A_*) + \nu(B \cap B^*) = E(K_1) = E(K) = \nu(A) + \nu(B).$$

Thus $\nu(A_* \cap A) + \nu(B) - \nu(B \cap B^*) = \nu(A_*)$ or $\nu(A_* \cap A) + \nu(B \cap \overline{B^*})
= \nu(A_*)$. Thus, by Theorem 9, the exterior covering $[A_* \cap A, B \cap \overline{B^*}]$ for
$K \cap (A_* \times \overline{B^*})$ is minimal and $A_* \cap A = A_*$ and $B \cap \overline{B^*} = \phi$. Thus
$A_* \subset A$ and $B \subset B^*$. Similarly, since $[A \cap A^*, B \cup B_*]$ is an m.e.p. for $K_1$,
$A \subset A^*$ and $B_* \subset B$. Since $[A, B]$ and $[A_* \cup S_i, B^* - T_i]$ are m.e.p.'s for $K_1$,

$$[A \cup (A_* \cup S_i), B \cap (B^* - T_i)] = [A \cup S_i, B - B \cap T_i]$$

is an m.e.p. for $K_1$ by Theorem 5. Its dimension is

$$\nu(A) + \nu(S_i) - \nu(A \cap S_i) + \nu(B) - \nu(B \cap T_i) = E(K_1) = E(K) = \nu(A) + \nu(B).$$

Hence $\nu(A \cap S_i) + \nu(B \cap T_i) = \nu(S_i)$. Thus, by Theorem 9, the exterior
covering $[A \cap S_i, B \cap T_i]$ of $K \cap (S_i \times T_i)$ is minimal and either $A \cap S_i
= \phi$ and $B \cap T_i = T_i$ or $A \cap S_i = S_i$ and $B \cap T_i = \phi$.

Since every m.e.p. for $K$ is an m.e.p. for $K_1$, $\beta \subset \gamma$ and hence $\alpha \subset \beta \subset \gamma$.

Since $\beta \subset \gamma$, if $[A, B] \in \beta$ and if $(s, t)$ is any edge of $A \times B$ then either
$s \in A_*$ and $t \in B^*$ or $s \in S_i$ and $t \in T_j$ with $i \neq j$, or $s \in A^*$ and $t \in B_*$.
Thus $(s, t)$ is not in $R_1$. By Theorems 3 and 8 the inadmissible subset $K_i$ of
$K$ is $K \cap R_2$. Hence the admissible subset is $K_a = K \cap R_1$. Since $K_a = K_1$,
(iv) follows.

This completes the proof of Theorem 10.


5. Some further properties of the canonical decomposition. Any graph
$K$ of finite exterior dimension decomposes the Cartesian product $S \times T$ into
three regions $R_1$, $R_2$, $R_3$. In this section the stability of this decomposition
under alterations of the graph $K$ is discussed as is also the role played by the
inadmissible edges in obstructing some of the m.e.p.'s for $K$.


Property 1. If the graph $K$ is altered by the addition or removal of edges
from $R_2$, the resulting graph has the same core as does $K$ and the regions $R_1$,
$R_2$, $R_3$ are unaltered. The proof is obvious.


Property 2. If edges in $R_1$ are added to $K$, the resulting graph produces the
same decomposition of $S \times T$ as does $K$ and hence each added element is
admissible. The proof again is immediate.














Property 3. Edges may be removed from $K \cap R_1$ without changing the
decomposition of $S \times T$ provided the following condition holds. If $K_0$ is the
resulting graph, then for each $i$ the subgraph $K_0 \cap (S_i \times T_i)$ has exterior
dimension $u_i$ while $(S_i \times T_i) - K_0 \cap (S_i \times T_i)$ has interior dimension less
than $u_i$ in the space $(S_i \times T_i)$. A similar statement must hold for the "tails"
$(A_* \times \overline{B^*})$ and $\overline{A^*} \times B_*$. Again the proof is omitted.

If the condition given in Property 3 is violated the following may occur. If
the exterior dimension of each of the blocks $K_0 \cap (A_* \times \overline{B^*})$, $K_0 \cap (S_i \times T_i)$,
$K_0 \cap (\overline{A^*} \times B_*)$ is the same as that of the corresponding block with
$K_0$ replaced by $K$, then in the decomposition of $S \times T$ with respect to $K_0$,
some of the blocks $A_* \times \overline{B^*}$, $S_i \times T_i$, $\overline{A^*} \times B_*$ may break down into smaller
irreducible sub-blocks, the remaining parts of the blocks going into $R_2$ and
$R_3$. If the exterior dimension of any of the blocks $K_0 \cap (A_* \times \overline{B^*})$,
$K_0 \cap (S_i \times T_i)$, $K_0 \cap (\overline{A^*} \times B_*)$ is less than that of the corresponding blocks
with $K_0$ replaced by $K$, the whole nature of the decomposition may be destroyed.
Certain edges originally in $R_2$ may become admissible and some edges
originally in $R_1$ may become inadmissible.

If $K$ is altered by adding edges from the region $R_3$, the new graph may
produce an entirely different decomposition of $S \times T$. As an example of the
effect of adding a single edge of $R_3$ to $K$, consider the following: Let $S =
\{a_1, a_2, \dots, a_k\}$, $T = \{b_1, b_2, \dots, b_k\}$, $K$ = the set of all $(a_i, b_j)$ with $j \ge i$.
Here $A_* = B_* = \phi$, and the admissible edges are $(a_i, b_i)$ $(i = 1, 2, \dots, k)$.
The irreducible block $S_i \times T_i$ consists of the single edge $(a_i, b_i)$. $R_2$ consists
of all edges $(a_i, b_j)$ with $j > i$ and $R_3$ all edges $(a_i, b_j)$ with $j < i$. If the edge
$(a_k, b_1)$ is added to $K$, the resulting block is irreducible and hence all points
become admissible. If instead of $(a_k, b_1)$ another edge $(a_i, b_j)$ with $j < i$ is
added to $K$ then in the new graph some but not all of the edges which were
inadmissible in $K$ become admissible in the augmented graph.
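
The triangular example can be checked by brute force for small $k$. The sketch below is illustrative only: it assumes that an edge is admissible exactly when it lies in at least one perfect matching of the $k$ by $k$ graph, and the function name `admissible_edges` is hypothetical.

```python
from itertools import permutations

def admissible_edges(k, edges):
    """Edges lying in at least one perfect matching of a k-by-k bipartite graph.

    Vertices a_1..a_k and b_1..b_k are represented by 1..k; `edges` is a set of
    pairs (i, j) standing for (a_i, b_j). Brute force over all k! permutations,
    so this is only meant for small k.
    """
    found = set()
    for perm in permutations(range(1, k + 1)):
        matching = {(i, perm[i - 1]) for i in range(1, k + 1)}
        if matching <= edges:          # every matched pair is an edge of the graph
            found |= matching
    return found

k = 4
K = {(i, j) for i in range(1, k + 1) for j in range(i, k + 1)}   # all (a_i, b_j), j >= i
diagonal = {(i, i) for i in range(1, k + 1)}

core_before = admissible_edges(k, K)                 # only the diagonal
core_after = admissible_edges(k, K | {(k, 1)})       # after adding (a_k, b_1)
core_partial = admissible_edges(k, K | {(3, 2)})     # after adding (a_3, b_2) instead
```

With $k = 4$ the first call returns the diagonal, the second returns every edge of the augmented graph, and the third returns the diagonal together with $(a_2, b_3)$ and $(a_3, b_2)$, so only some of the formerly inadmissible edges become admissible.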

The role of the inadmissible elements of $K$ as obstructions to m.e.p.'s is now
considered. The core $K_a$ of $K$ has the $2^k$ m.e.p.'s $[A, B]$,

$$A = \Bigl(\bigcup_{i \in \Lambda} S_i\Bigr) \cup A_*, \qquad B = \Bigl(\bigcup_{i \in \Pi} T_i\Bigr) \cup B_*,$$

where $\Lambda$ and $\Pi$ are complementary subsets of $1, 2, \dots, k$. Because of the
occurrence of inadmissible elements in $K$, some of the $2^k$ m.e.p.'s of $K_a$ may
not be m.e.p.'s for $K$. The following theorem shows that in the extreme case
the number of m.e.p.'s may be reduced from $2^k$ to $k + 1$.


THEOREM 11. An m.e.p. $[A, B]$,

$$A = \Bigl(\bigcup_{i \in \Lambda} S_i\Bigr) \cup A_*, \qquad B = \Bigl(\bigcup_{i \in \Pi} T_i\Bigr) \cup B_*,$$

$\Lambda$ and $\Pi$ as above, for the core of $K$ is an m.e.p. for $K$, if and only if $\bigcup (S_j \times T_k)$,
taken over all pairs $j, k$, in which $j < k$, $j \in \Pi$, $k \in \Lambda$, contains no edge of $K$.


Proof. Let $(s, t)$ be an edge of $K$. It is immediate that $(s, t)$ is in some
$(S_j \times T_k)$, $j < k$, $j \in \Pi$, $k \in \Lambda$ if and only if $s \in \overline{A}$ and $t \in \overline{B}$.


COROLLARY. If every set $S_j \times T_k$ in which $j < k$ contains at least one edge
of $K$ then $K$ has exactly the $k + 1$ m.e.p.'s of the collection $\alpha$, namely $[A_*, B^*]$,
$[A_1, B_1], \dots, [A_{k-1}, B_{k-1}], [A^*, B_*]$.
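
The drop from $2^k$ to $k + 1$ can be observed on the triangular graph of the preceding example. The following sketch enumerates minimum exterior pairs by exhaustive search. It reads an exterior pair $[A, B]$ as a pair of vertex sets such that every edge $(s, t)$ has $s \in A$ or $t \in B$, which is how the term is used in the proofs above; the helper name `minimum_covers` is hypothetical.

```python
from itertools import combinations

def minimum_covers(S, T, edges):
    """All exterior pairs [A, B] of minimum dimension |A| + |B|.

    A pair covers the graph when every edge (s, t) has s in A or t in B.
    Exhaustive search over sets of lines, so only suitable for small graphs.
    """
    lines = [("s", s) for s in S] + [("t", t) for t in T]
    for size in range(len(lines) + 1):
        covers = []
        for chosen in combinations(lines, size):
            A = frozenset(v for side, v in chosen if side == "s")
            B = frozenset(v for side, v in chosen if side == "t")
            if all(s in A or t in B for s, t in edges):
                covers.append((A, B))
        if covers:                      # first size that admits a cover is minimal
            return covers
    return []

k = 4
S = T = range(1, k + 1)
K = {(i, j) for i in S for j in T if j >= i}      # triangular graph
core = {(i, i) for i in S}                        # its admissible edges

meps_of_K = minimum_covers(S, T, K)
meps_of_core = minimum_covers(S, T, core)
```

For $k = 4$ the core has $2^4 = 16$ minimum exterior pairs while $K$ itself has only $5$, each of the form $[\{a_1, \dots, a_m\}, \{b_{m+1}, \dots, b_k\}]$.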


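6. Other decompositions of $S \times T$. We have already considered the
decompositions $R_1$, $R_2$, $R_3$ of $S \times T$. Although $R_1$ is intrinsic, depending only
on $S$, $T$, and $K$, there are cases in which the sets $A_i$, $B_i$ are not uniquely
determined and, in such cases, $R_2$ and $R_3$ are not uniquely defined.

We now present two completely intrinsic decompositions of $S \times T$.

We use $\beta$ to denote the collection of m.e.p.'s for $K$. By $\bigcap_\beta$ or $\bigcup_\beta$
we mean the intersection or union taken over the collection $\beta$.

We define

$$V_1 = W_1 = \bigcap_\beta \bigl((A \times \overline{B}) \cup (\overline{A} \times B)\bigr),$$
$$V_2 = \bigcap_\beta \bigl((A \times B) \cup (A \times \overline{B}) \cup (\overline{A} \times B)\bigr) - V_1,$$
$$V_3 = \bigcup_\beta (\overline{A} \times \overline{B}),$$
$$W_2 = \bigcup_\beta (A \times B),$$
$$W_3 = \bigcap_\beta \bigl((\overline{A} \times \overline{B}) \cup (A \times \overline{B}) \cup (\overline{A} \times B)\bigr) - W_1.$$

(Note that $W_1$ may be obtained from $V_1$, $W_3$ from $V_2$, and $W_2$ from $V_3$, by
replacing all $A$'s and $B$'s by their complements.) Since, for every $[A, B]$,
$(A \times B) \cup (A \times \overline{B}) \cup (\overline{A} \times B)$ and $\overline{A} \times \overline{B}$ are complementary subsets
of $S \times T$, $V_1$, $V_2$, and $V_3$ are disjoint and have $S \times T$ as their union. $W_1$, $W_2$,
and $W_3$ have the same property.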


THEOREM 12. If $K_a$ and $K_i$ are the admissible and inadmissible subsets of a
reducible graph $K$ of finite exterior dimension then

(1) $K_a \subset R_1 = V_1 = W_1$,
(2) $K_i \subset V_2 \subset R_2 \subset W_2$,
(3) $W_3 \subset R_3 \subset V_3$.


Proof. (1) We need only prove $R_1 = V_1$. If $(s, t)$ is an edge of $R_1$, if $[A, B]$
is any m.e.p., then, by Theorem 10 (since $\beta \subset \gamma$) $s \in A$ or $t \in B$ but not
both. Thus $(s, t) \in ((A \times \overline{B}) \cup (\overline{A} \times B))$ for every m.e.p. $[A, B]$. Hence
$R_1 \subset V_1$.

By Corollary (1) to Theorem 8, if $(s, t)$ is in $R_2$, then $(s, t)$ is in $A \times B$ for
some m.e.p. and by Theorem 8, Corollary (2), if $(s, t)$ is in $R_3$, then $(s, t)$ is
in $\overline{A} \times \overline{B}$ for some m.e.p. Thus $R_2 \cap V_1 = R_3 \cap V_1 = \phi$, and hence $V_1 \subset R_1$.

(2) $V_2$ consists of all the edges of $S \times T$ which are in every cover of $K$
but not in $V_1$ (that is, not in $R_1$). Thus $K_i \subset V_2$. By Corollary (2) to Theorem
8, any edge of $R_3$ is in $V_3$ so that $V_2 \subset R_2$. By Corollary (1) to Theorem 8,
$R_2 \subset W_2$. (3) follows from (1) and (2).

The following examples illustrate these decompositions. If $S = \{a_1, a_2,
\dots, a_k\}$, $T = \{b_1, b_2, \dots, b_k\}$ and if $K$ is the set of all $(a_i, b_j)$ with $j \ge i$,
then $K_a = R_1 = V_1 = W_1$, $K_i = R_2 = V_2 = W_2$ and $R_3 = V_3 = W_3$.

On the other hand, if $S = \{a_1, a_2, \dots, a_k\}$, $T = \{b_1, b_2, \dots, b_k\}$ and $K$
is the set of all $(a_i, b_i)$ then $K_a = K = R_1 = V_1 = W_1$ and $K_i = \phi = V_2 = W_3$.
$R_2$ depends on the manner in which the construction of $A_1, A_2, \dots$ is effected,
but, in any case, it consists of $\tfrac{1}{2}k(k - 1)$ edges. For example, if $(a_i, b_i)$ is
$S_i \times T_i$, then $R_2$ consists of all $(a_i, b_j)$ with $i < j$.
$W_2$ consists of all $(a_i, b_j)$ with $i \neq j$.
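
Both examples can be verified mechanically for small $k$. The sketch below is a brute-force illustration under the same reading of "exterior pair" as a vertex cover; the function names are hypothetical, and each cell $(s, t)$ is classified by how many of its two lines belong to each m.e.p.

```python
from itertools import combinations, product

def minimum_covers(S, T, edges):
    """All exterior pairs [A, B] of minimum dimension (exhaustive search)."""
    lines = [("s", s) for s in S] + [("t", t) for t in T]
    for size in range(len(lines) + 1):
        covers = []
        for chosen in combinations(lines, size):
            A = frozenset(v for side, v in chosen if side == "s")
            B = frozenset(v for side, v in chosen if side == "t")
            if all(s in A or t in B for s, t in edges):
                covers.append((A, B))
        if covers:
            return covers
    return []

def intrinsic_decompositions(S, T, edges):
    """Return (V1, V2, V3, W2, W3); W1 equals V1 by definition."""
    beta = minimum_covers(S, T, edges)
    V1, V2, V3, W2, W3 = set(), set(), set(), set(), set()
    for s, t in product(S, T):
        # number of lines of each m.e.p. passing through the cell: 0, 1 or 2
        hits = [(s in A) + (t in B) for A, B in beta]
        if all(h == 1 for h in hits):
            V1.add((s, t))
        else:
            if all(h >= 1 for h in hits):
                V2.add((s, t))
            if all(h <= 1 for h in hits):
                W3.add((s, t))
        if any(h == 0 for h in hits):
            V3.add((s, t))
        if any(h == 2 for h in hits):
            W2.add((s, t))
    return V1, V2, V3, W2, W3

k = 4
idx = range(1, k + 1)
upper = {(i, j) for i in idx for j in idx if j > i}
lower = {(i, j) for i in idx for j in idx if j < i}
diag = {(i, i) for i in idx}

tri = intrinsic_decompositions(idx, idx, diag | upper)   # K: all (a_i, b_j), j >= i
dia = intrinsic_decompositions(idx, idx, diag)           # K: all (a_i, b_i)
```

For the triangular graph the output is $V_1 = $ diagonal, $V_2 = W_2 = \{j > i\}$ and $V_3 = W_3 = \{j < i\}$; for the diagonal graph $V_2 = W_3 = \phi$ and $W_2 = V_3$ is the set of all off-diagonal cells.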


7. Application to matrices and computation. In this section some
properties of the matrix representation of a graph are studied. Throughout
this section the following notation is used. $C$ is a $p$ by $q$ matrix with
non-negative entries of term rank $\rho$. It is assumed that $p \le q$. $S$ represents the
sum of all the entries in $C$, $M$ the maximum sum of the entries in any row
or column of $C$, $m$ the minimum sum of the entries in any row or column of
$C$. Also to be used is the null dimension $n$ of the matrix $C$, defined as the
maximum value of $u + v$ where $C$ contains a $u$ by $v$ block of zeros. Theorem 4
states that $\rho + n = p + q$ unless $\rho = p$ in which case $n \le q$. A problem of
some interest is to estimate $\rho$ for a given matrix $C$. For a large-sized matrix
this is a problem of considerable difficulty. A systematic computing machine
programme for the exact determination of $\rho$ would involve a search through
$q!/(q - p)!$ terms which appear in the $p$ by $p$ minors of $C$. In what follows
estimates of $\rho$ in terms of $p$, $q$, $S$, $M$, $m$ are obtained. Furthermore,
transformations in the matrix are introduced which lead to improved estimates of
$\rho$. The results obtained can be applied to the problem of distinct representations
of sets (6) and to variants of the optimal assignment problem in the theory
of games. It is not to be expected that exact values of $\rho$ can be obtained using
the above-mentioned parameters only. In fact, in a recent paper, Ryser (8)
has shown that for standard matrices (those in which the non-zero entries
are 1), using the transformation of replacing

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{by} \quad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

$\rho$ may be varied between two values $\rho_1$ and $\rho_2$ while each of the
parameters $S$, $M$, $m$ is held constant.

The following result has been found recently by the authors in (2): If
$p - r < S/M \le p - r + 1$, then $\rho \ge p - r + 1$. It will soon be shown
how to modify the matrix $C$ in such a way that $\rho$ is held constant but that
$S/M$ is increased. This could lead to a better lower bound for $\rho$.








THEOREM 13. For any matrix $C$ with non-negative entries, $\rho \ge p + q - S/m$,
unless $\rho = p$.


Proof. If $\rho \neq p$, $\rho + n = p + q$. Suppose $C$ has a $u$ by $v$ block of zeros
with $u + v = n$. By adding the entries in the $u$ rows and $v$ columns of $C$
which contain this block of zeros, it follows that $um + vm \le S$. Hence
$nm \le S$, or $(p + q - \rho)m \le S$. Hence $\rho \ge p + q - S/m$.


COROLLARY. If $q > S/m$, then $\rho = p$.
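
The bound of Theorem 13 can be compared with the exact term rank on a small matrix. The sketch below computes the term rank as a maximum bipartite matching by augmenting paths, which is a standard technique and not the method of the paper; the function names and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction

def term_rank(C):
    """Maximum number of non-zero entries of C, no two in the same row or column.

    Computed as a maximum bipartite matching (rows against columns) with the
    usual augmenting-path search.
    """
    q = len(C[0])
    match_col = [-1] * q            # match_col[j] = row currently matched to column j

    def augment(i, seen):
        for j in range(q):
            if C[i][j] != 0 and j not in seen:
                seen.add(j)
                if match_col[j] == -1 or augment(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(len(C)))

def line_sums(C):
    """Row sums followed by column sums."""
    return [sum(r) for r in C] + [sum(c) for c in zip(*C)]

# Sample matrix (an assumption for illustration): a 3 by 2 block of zeros,
# so the null dimension is 5 and the term rank is 4 + 4 - 5 = 3.
C = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 3, 3],
]
p, q = len(C), len(C[0])
sums = line_sums(C)
S, M, m = sum(sum(r) for r in C), max(sums), min(sums)

rho = term_rank(C)
bound_13 = p + q - Fraction(S, m)     # Theorem 13: rho >= p + q - S/m when rho != p
bound_SM = Fraction(S, M)             # quoted result: rho >= S/M
```

Here $S = 15$, $m = 3$, $M = 6$, so Theorem 13 gives $\rho \ge 3$ and the quoted result gives $\rho \ge 5/2$; both agree with the exact value $\rho = 3$.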

THEOREM 14. For any standard matrix $C$,

$$\rho \ge \frac{S - m^2 + m(q - p)}{q - m}$$

unless $\rho = p$.


Proof. Suppose $\rho \neq p$, and $C$ has a $u$ by $v$ block of zeros with $u + v = n$.
Then $u \le p - m$, $v \le q - m$. The smallest possible value for the number of zeros
in such a block occurs when $|v - u|$ is a maximum and this occurs when
$v = q - m$ and $u = n - q + m = p + m - \rho$. The maximum number of
1's in $C$ occurs when all places except the $u$ by $v$ block of 0's are occupied with
1's.

Hence $qp - (q - m)(p + m - \rho) \ge S$, which proves the required inequality.

THEOREM 15. For any standard matrix $C$, $\rho \ge 2m - m^2/q$ unless $\rho = p$.

Proof. From the inequalities obtained in Theorems 13 and 14,

$$qp - (q - m)(p + m - \rho) \ge S \ge (p + q - \rho)m.$$

The inequality of the extreme terms reduces to that of the theorem.
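
The reduction of the extreme terms is a short piece of algebra; it is written out here as a worked check, using only the symbols $p$, $q$, $m$ and the term rank (written $\rho$) of the statement.

```latex
\begin{align*}
qp - (q - m)(p + m - \rho)
  &= qp - \bigl(qp + qm - q\rho - mp - m^2 + m\rho\bigr) \\
  &= q\rho - qm + mp + m^2 - m\rho, \\
(p + q - \rho)\,m &= mp + qm - m\rho .
\end{align*}
% Subtracting the second line from the first, the terms mp and m\rho cancel:
\begin{align*}
q\rho - qm + m^2 \;\ge\; qm
  \quad\Longleftrightarrow\quad q\rho \;\ge\; 2qm - m^2
  \quad\Longleftrightarrow\quad \rho \;\ge\; 2m - \frac{m^2}{q}.
\end{align*}
```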


Remark. The inequality for $\rho$ given by Theorem 15 is not necessarily weaker
than those given in Theorems 13 and 14. Also, in general the inequalities
connecting $\rho$ with $p$, $q$, $S$, $m$ give a better lower bound for $\rho$ than the inequalities
connecting $\rho$ with $p$, $S$, $M$ (when $p = q$) quoted previously.

The estimate $\rho \ge p + q - S/m$ will be improved if the matrix $C$ can be
replaced by another having the same $\rho$ but a smaller $S/m$. In what follows
we may always assume that $m \neq 0$, since $m = 0$ occurs only if some rows or
columns of $C$ have only zero entries. On deleting these rows and columns the
new matrix has a value of $m \neq 0$.

A matrix C* is said to be graph equivalent to C if it is obtained from C by 
a finite sequence of the following types of operation: 

(1) Interchange of two rows. 

(2) Interchange of two columns. 

(3) Replacement of a non-zero entry by any positive number. 

For the next two theorems a systematic method will be given for replacing 
C by a graph equivalent matrix C* for which S/M is increased and S/m is 
decreased. The method is easily adaptable for machine computation. 








Let $C$ be any $p$ by $q$ matrix with non-negative entries. Let $M$ be the maximum
value of any row or column sum in $C$ and $M^*$ the next largest row or column
sum in $C$. Let $C^*$ be the matrix obtained from $C$ as follows: If the entry $c_{ij}$
of $C$ does not occur in a row or column at which the maximum sum $M$ is
attained, put $c^*_{ij} = c_{ij}$. If $c_{ij}$ appears in a row or column at which the maximum
sum $M$ is attained, put

$$c^*_{ij} = \frac{M^*}{M}\, c_{ij}.$$


THEOREM 16. If $S^*$ is the sum of all entries in $C^*$ then

$$\frac{S^*}{M^*} \ge \frac{S}{M}.$$

Proof. Rearranging rows and columns in $C$ will not change the values of
the parameters $S$, $M$, $S^*$, $M^*$. Suppose the first $u$ rows and first $v$ columns of
$C$ have the sum $M$, all other rows and columns having sums $< M$. Partition
$C$ into four blocks as follows:

Let $A$ be the matrix $c_{ij}$, $i \le u$, $j \le v$,

$B$ be the matrix $c_{ij}$, $i \le u$, $j > v$,

$D$ be the matrix $c_{ij}$, $i > u$, $j \le v$,

$E$ be the matrix $c_{ij}$, $i > u$, $j > v$.

Let $a$, $b$, $d$, $e$ be the sums of the entries in $A$, $B$, $D$, $E$ respectively; let $A^*$, $B^*$,
$D^*$, $E^*$ be the corresponding submatrices of $C^*$ and let $a^*$, $b^*$, $d^*$, $e^*$ be the
corresponding entry sums. Then $M^*$ is the maximum row or column sum of $C^*$
and

$$a^* = \frac{M^*}{M}\, a, \qquad b^* = \frac{M^*}{M}\, b, \qquad d^* = \frac{M^*}{M}\, d, \qquad e^* = e.$$

Also

$$S^* = a^* + b^* + d^* + e^* = \frac{M^*}{M}\, a + \frac{M^*}{M}\, b + \frac{M^*}{M}\, d + e.$$

Hence

$$\frac{S^*}{M^*} = \frac{a}{M} + \frac{b}{M} + \frac{d}{M} + \frac{e}{M^*}$$

or

$$\frac{S^*}{M^*} = \frac{S}{M} + e\left(\frac{1}{M^*} - \frac{1}{M}\right) \ge \frac{S}{M}.$$


COROLLARY. If the $M^*$ does not exist, then $p = q$ and the matrix $C$ is
doubly stochastic so that (see (2)) $\rho = p$. If $S^*/M^* = S/M$ then $e = 0$.
But then $C$ contains a block of zeros of size $p - u$ by $q - v$ so that, from
Theorem 4, $\rho \le u + v$. This supplies an upper bound for $\rho$.

From the computational point of view this corollary is not as trivial as might 
first appear. For a large matrix the problem of locating a u by v block of 
zeros might require a search of prohibitive length. 
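
One step of the transformation is easy to carry out exactly with rational arithmetic. The sketch below assumes the scaling $c^*_{ij} = (M^*/M)\,c_{ij}$ for entries in a maximal row or column, as used in the proof of Theorem 16; the function name `rescale_max_lines` and the sample matrix are hypothetical, and integer entries are assumed so that the fractions are exact.

```python
from fractions import Fraction

def rescale_max_lines(C):
    """One step of the transformation behind Theorem 16 for an integer matrix C.

    Entries lying in a row or column whose sum equals the maximum M are
    multiplied by M*/M, where M* is the next largest line sum; all other
    entries are unchanged. Returns (C*, S/M, S*/M*), or None if M* does not
    exist (all line sums equal).
    """
    rows = [sum(r) for r in C]
    cols = [sum(c) for c in zip(*C)]
    M = max(rows + cols)
    smaller = [x for x in rows + cols if x < M]
    if not smaller:
        return None
    M_star = max(smaller)
    factor = Fraction(M_star, M)
    C_star = [
        [c * factor if rows[i] == M or cols[j] == M else Fraction(c)
         for j, c in enumerate(row)]
        for i, row in enumerate(C)
    ]
    S = sum(rows)
    S_star = sum(sum(r) for r in C_star)
    return C_star, Fraction(S, M), S_star / M_star

C = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 3, 3],
]
C_star, ratio_before, ratio_after = rescale_max_lines(C)
new_max = max([sum(r) for r in C_star] + [sum(c) for c in zip(*C_star)])
```

For this matrix $M = 6$, $M^* = 5$ and $e = 9$, so $S/M = 5/2$ rises to $S^*/M^* = 14/5$, in agreement with $S/M + e(1/M^* - 1/M)$. The step can be repeated on `C_star` after clearing denominators, since multiplying every entry by a common positive factor changes neither the zero pattern nor the ratio of $S$ to $M$.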














By iteration of the process with $S^*/M^* = S_1/M_1$ a sequence of values

$$\frac{S}{M} \le \frac{S_1}{M_1} \le \frac{S_2}{M_2} \le \cdots$$

is obtained. Either for some $i$,

$$\frac{S_i}{M_i} = \frac{S_{i+1}}{M_{i+1}},$$

in which case the corollary to Theorem 16, together with the results in (2)
quoted previously, give upper and lower bounds for $\rho$, or else the sequence

$$\frac{S}{M} < \frac{S_1}{M_1} < \frac{S_2}{M_2} < \frac{S_3}{M_3} < \cdots$$

is an infinite properly increasing sequence. This sequence is bounded above
since for all $i$,

$$\frac{S_i}{M_i} \le \rho \le p.$$

In this case the terms approach a limit, and the result quoted previously
together with an approximation to this limit gives a lower bound for $\rho$.
Let $C$ be any $p$ by $q$ matrix with non-negative entries. Let $m \neq 0$ be the
minimum value of any row or column sum in $C$. Let $C^*$ be the matrix obtained
from $C$ as follows: If the entry $c_{ij}$ of $C$ does not occur in a row or column at
which the minimum sum $m$ is attained, put $c^*_{ij} = c_{ij}$. If $c_{ij}$ appears in a row
or column at which the minimum sum $m$ is attained put

$$c^*_{ij} = \frac{m^*}{m}\, c_{ij},$$

where, by analogy with $M^*$, $m^*$ denotes the next smallest row or column sum in $C$.


THEOREM 17. The sum $S^*$ of the entries in $C^*$ satisfies

$$\frac{S^*}{m^*} \le \frac{S}{m}.$$

Proof. The proof is identical with that given for Theorem 16.


The corollaries to Theorem 16 and the remarks concerning the iteration
of the transformation have immediate analogues in Theorem 17.


8. Concluding remarks. In the previous sections we have avoided the
language of lattice theory in describing our results. In this language some of
the results take on an interesting form and it is possible that the lattice
formulation might lead to further ramification of the theory. We also note in what
follows that our notion of an interior pair can be used to reformulate the
map colouring problem.

The m.e.p.'s of a graph $K$ may be partially ordered in a natural manner
as follows: $[A, B] \subset [C, D]$ if and only if $A \subset C$ and $B \supset D$ by set inclusion.
In this ordering the lattice-theoretic join and meet are given by

$$[A, B] \cup [C, D] = [A \cup C, B \cap D]$$

and

$$[A, B] \cap [C, D] = [A \cap C, B \cup D],$$

respectively. By Theorem 5, if $[A, B]$ and $[C, D]$ are m.e.p.'s for a graph $K$,
then $[A, B] \cup [C, D]$ and $[A, B] \cap [C, D]$ are also m.e.p.'s for $K$. Also, since
the definitions have been given in terms of set inclusion, the resulting lattice
is distributive. Hence we have the following theorem:


THEOREM 18. If $K$ is any graph, the set of all m.e.p.'s for $K$ form a distributive
lattice.


Using the notation of §3, we can define complements of an m.e.p. as follows.
The complement of $[A, B]$ is taken as $[(A^* - A) \cup A_*, (B^* - B) \cup B_*]$.
The complement of $[A, B]$ is not necessarily an m.e.p. for $K$. In fact the
remark preceding Theorem 11, together with the proof used in Theorem 11,
yields the following theorem.


THEOREM 19. The lattice of all m.e.p.'s for a graph $K$ is complemented if and
only if $K$ contains no inadmissible elements.


We remark here without going into details that the region $R_2$ can be
subdivided into subregions such that for each of these subregions the presence of
elements of $K$ obstructs the complements of certain m.e.p.'s for $K$ from being
m.e.p.'s for $K$.

In another direction it may be worth while to examine the polarity
construction given by Garrett Birkhoff (1, p. 54). Let $K$ be any graph in $S \times T$
and $\overline{K}$ its complement. To any subset $A$ of $S$ we associate the subset $B =
\psi_K(A)$ of $T$ defined as follows: $b \in \psi_K(A)$ if and only if $(a, b) \in K$ for all $a$
in $A$. Similarly with any subset $B$ of $T$ we associate the subset $A = \phi_K(B)$ of
$S$ defined as follows: $a \in \phi_K(B)$ if and only if $(a, b) \in K$ for all $b \in B$. The
following properties of this construction are given in (1). For any $A$ and any
$B$,

$$\phi_K\psi_K(A) \supset A, \quad \psi_K\phi_K(B) \supset B, \quad \psi_K\phi_K\psi_K(A) = \psi_K(A), \quad \phi_K\psi_K\phi_K(B) = \phi_K(B).$$

If $A = \phi_K\psi_K(A)$, $A$ is said to be closed. Similarly, $B$ is closed if $B = \psi_K\phi_K(B)$.
A pair $(A, B)$ is called a polar pair with respect to $K$ if $B = \psi_K(A)$ and $A
= \phi_K(B)$. If $(A, B)$ is a polar pair with respect to $K$, $A$ and $B$ are both
necessarily closed sets. To establish a connection between the concept of a polar
pair and the concepts of interior and exterior pairs for a graph we make the
following further definitions. An exterior cover $[A, B]$ of $K$ is said to be
uncontractable if for any other exterior cover $[A_1, B_1]$ such that $A_1 \subset A$ and $B_1
\subset B$ then $A_1 = A$ and $B_1 = B$. By this definition an m.e.p. $[A, B]$ is an
uncontractable cover of minimum dimension. More generally we may say that
the cover $[A, B]$ is uncontractable with respect to $A$, if for any other cover
$[A_1, B]$ we have $A \subset A_1$. Uncontractability with respect to $B$ is defined in a
similar way. An interior pair $\{A, B\}$ for $K$ is said to be inextensible if for any
interior pair $\{A_1, B_1\}$ such that $A \subset A_1$ and $B \subset B_1$ then $A = A_1$ and $B
= B_1$. A maximal interior pair $\{A, B\}$ is then simply an inextensible interior
pair of maximum dimension. We can now state the following theorem.


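THEOREM 20. Let $K$ be any graph and $\overline{K}$ its complement, and let $[A, B]$ be
an exterior pair for $K$. Then

(1) $\psi_{\overline{K}}(\overline{A}) \supset \overline{B}$, $\phi_{\overline{K}}(\overline{B}) \supset \overline{A}$.

(2) $\psi_{\overline{K}}(\overline{A}) = \overline{B}$ if and only if $[A, B]$ is uncontractable with respect to $B$;
$\phi_{\overline{K}}(\overline{B}) = \overline{A}$ if and only if $[A, B]$ is uncontractable with respect to $A$.

(3) $[A, B]$ is an uncontractable cover for $K$ if and only if $(\overline{A}, \overline{B})$ is a polar
pair with respect to $\overline{K}$.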


The proof of Theorem 20 is immediate and is not given here. Another 
theorem whose proof we omit is the following: 


THEOREM 21. With respect to any graph K an interior pair {A, B} is inexten- 
sible if and only if (A, B) is a polar pair. 
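
The polarity maps and the statement of Theorem 21 can be explored on a small graph. The sketch below is illustrative: `psi` and `phi` are hypothetical names for the two maps of the polarity construction, and an interior pair $\{A, B\}$ for $K$ is taken to mean $A \times B \subset K$, an assumption consistent with the map-colouring remark at the end of the section.

```python
from itertools import combinations

def psi(K, T, A):
    """psi_K(A): the elements b of T with (a, b) in K for every a in A."""
    return frozenset(b for b in T if all((a, b) in K for a in A))

def phi(K, S, B):
    """phi_K(B): the elements a of S with (a, b) in K for every b in B."""
    return frozenset(a for a in S if all((a, b) in K for b in B))

def subsets(X):
    X = list(X)
    for r in range(len(X) + 1):
        for c in combinations(X, r):
            yield frozenset(c)

S = T = (1, 2, 3)
K = {(1, 1), (1, 2), (2, 2), (2, 3), (3, 3)}   # a small sample graph

# Closure properties quoted from (1), checked for every subset A of S.
closure_ok = all(
    phi(K, S, psi(K, T, A)) >= A
    and psi(K, T, phi(K, S, psi(K, T, A))) == psi(K, T, A)
    for A in subsets(S)
)

# Polar pairs: B = psi(A) and A = phi(B).
polar_pairs = {
    (A, psi(K, T, A)) for A in subsets(S) if phi(K, S, psi(K, T, A)) == A
}

# Inextensible interior pairs, found by direct comparison of all interior pairs.
interior = [
    (A, B) for A in subsets(S) for B in subsets(T)
    if all((a, b) in K for a in A for b in B)
]
inextensible = {
    (A, B) for A, B in interior
    if not any((A1, B1) != (A, B) and A <= A1 and B <= B1 for A1, B1 in interior)
}
```

On this graph the set of inextensible interior pairs coincides with the set of polar pairs, as Theorem 21 asserts.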


Whether the concepts of uncontractable and inextensible pairs can lead to 
important properties of graphs is a matter of speculation. It may be worth 
mentioning here that the set of all uncontractable pairs for a graph K do not 
form a lattice in any natural way as do the m.e.p.’s for K. 

Finally, it might be worth while investigating whether the concepts of cover
and interior pair can lead to interesting results in connection with symmetric
graphs or with dominance matrices. While we have done no work on these
problems, it turns out that many of the interesting problems can be
formulated in terms of concepts introduced here. We give one example, that of
colouring a map in $\lambda$ colours. Let $M$ be a map of $r$ regions $a_1, a_2, a_3, \dots, a_r$.
Put $S = T = \{a_1, a_2, \dots, a_r\}$. The graph $K$ corresponding to the map $M$
is the set of all $(a_i, a_j)$ such that the regions $a_i$ and $a_j$ are contiguous. A
colouring of $M$ using $\lambda$ colours consists of decomposing $S$ into $\lambda$ mutually exclusive
sets $S_1, S_2, S_3, \dots, S_\lambda$ in such a way that the pairs $\{S_1, S_1\}, \{S_2, S_2\}, \dots,
\{S_\lambda, S_\lambda\}$ are interior to the complement of $K$.
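
The colouring condition translates directly into a test on the complement of $K$. The sketch below assumes that $K$ is symmetric and contains no pair $(a_i, a_i)$, and that a pair $\{A, B\}$ is interior to a graph when $A \times B$ is contained in it; the function names and the four-region map are hypothetical.

```python
def is_interior_pair(graph, A, B):
    """{A, B} is interior to `graph` when every (a, b) of A x B is one of its edges."""
    return all((a, b) in graph for a in A for b in B)

def is_colouring(regions, K, classes):
    """Check that `classes` (a list of sets S_1, ..., S_lambda) colours the map.

    The classes must partition the regions, and each pair {S_i, S_i} must be
    interior to the complement of the contiguity graph K.
    """
    regions = set(regions)
    complement = {(a, b) for a in regions for b in regions} - set(K)
    partition = (
        set().union(*classes) == regions
        and sum(len(c) for c in classes) == len(regions)
    )
    return partition and all(is_interior_pair(complement, c, c) for c in classes)

# Four regions arranged in a ring: 1-2, 2-3, 3-4, 4-1 are contiguous.
ring = [(1, 2), (2, 3), (3, 4), (4, 1)]
K = set(ring) | {(b, a) for a, b in ring}

good = is_colouring({1, 2, 3, 4}, K, [{1, 3}, {2, 4}])
bad = is_colouring({1, 2, 3, 4}, K, [{1, 2}, {3, 4}])
```

The first decomposition is a colouring in two colours; the second is rejected because regions 1 and 2 are contiguous.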


REFERENCES 


1. Garrett Birkhoff, Lattice theory (revised edition), Amer. Math. Soc. Coll. Pub., 25 (1948).

2. A. L. Dulmage and N. S. Mendelsohn, Some generalizations of the problem of distinct
representatives, Can. J. Math., 10 (1958), 230-241.

3. A. L. Dulmage and N. S. Mendelsohn, The convex hull of sub permutation matrices,
Proc. Amer. Math. Soc., 9 (1958), 253-254.

4. A. L. Dulmage and I. Halperin, On a theorem of Frobenius-König and J. von Neumann's
game of hide and seek, Trans. Roy. Soc. Can. Ser. III, 49 (1955), 23-29.

5. D. König, Theorie der endlichen und unendlichen Graphen (Chelsea, New York, 1950).

6. H. B. Mann and H. J. Ryser, Systems of distinct representatives, Amer. Math. Monthly,
60 (1953), 397-401.

7. O. Ore, Graphs and matching theorems, Duke Math. J., 22 (1955), 625-639.

8. H. J. Ryser, Matrices of zeros and ones, Can. J. Math., 9 (1957), 371-377.

9. J. von Neumann, A certain zero sum two person game equivalent to the optimal assignment
problem, Contribution to the theory of games II, Annals of Mathematics Studies, 28
(Princeton, 1953), pp. 5-12.


University of Manitoba











ORIENTED FLAT SUBMANIFOLDS IN AN AFFINE 
SPACE 


M. J. ENGLEFIELD 


Introduction. The simplest examples of figures which possess inner orienta- 
tion are a sensed line, and a plane in which a sense of rotation is specified. 
Suppose two sensed lines, which intersect in a finite point, are given in a 
definite order. Then there is only one way in which the first can be rotated 
to coincide with the second without passing through the second. In this way, 
two ordered, sensed lines determine a sense of rotation in the plane which 
contains them. The theorems proved below are essentially generalizations of 
this result to spaces of higher dimension, and the corresponding results con- 
cerning outer orientation. This concept is also simply illustrated in two 
dimensions: a line divides the plane into two parts; it has outer orientation 
if these two parts are given a definite order. In three dimensions, a line is 
given outer orientation by specifying a sense of rotation around it, while a 
plane is given outer orientation by assigning an order to the two parts into 
which it divides the space. 

This paper is concerned with the section $E_s$ (of dimension $s$), and the join
$E_t$, of two flat submanifolds $E_p$ and $E_q$ of an $n$-dimensional affine space. I
shall demonstrate the following results:

(i) Suppose the section is not null or improper. If $pq + st$ is even, then (a) an
inner orientation in the join $E_t$ is determined if inner orientations are given in
$E_p$, $E_q$, and their section $E_s$; and (b) an outer orientation around the section
$E_s$ is determined if outer orientations are given around $E_p$, $E_q$, and their join
$E_t$. If $pq + st$ is odd, the results are true if $E_p$ and $E_q$ are ordered.

(ii) The corresponding results when the section is null or improper are obtained
by replacing $pq + st$ by $pq + st + p + q$.

The notation used follows the kernel-index method described by Schouten 
in his recent books (1; 2), which also provide most of the terminology. The 
only geometric objects appearing in this paper are vectors, each of which 
has a kernel consisting of a capital Latin letter, usually with an index below 
or above it. The initial letters of the alphabet are reserved for contravariant 
vectors, and the last few letters for covariant vectors. The components of a 
vector in a given co-ordinate system are denoted by its kernel with the in- 
dices of the co-ordinate system on the right. Thus 


$$\overset{z}{Z}_{\lambda'} \qquad (\lambda' = 1', 2', \dots, n')$$


Received October 25, 1957. The author is a National Research Laboratories Postdoctorate 
Fellow. 


are the components of the covariant vector $\overset{z}{Z}$ in the co-ordinate system
$(\kappa')$. Like $\kappa'$, $\lambda'$ is one of the running indices of the system, while $1', \dots, n'$
are the fixed indices. Each co-ordinate system has its own running and fixed
indices.

Sections 1 and 3 discuss flat submanifolds, giving some familiar results 
which are required. Outer and inner orientation are defined in §2 with re- 
ference to the covariant and contravariant domains belonging to a submani- 
fold. The result (i)(a), which was given by Schouten, is proved in §4. Finally, 
the remaining results are obtained and illustrated with simple examples. 


1. The domains of a flat submanifold. The set of points with co-ordinates
satisfying a number of linear equations is called a flat submanifold. If $n - p$
of these equations are independent, the submanifold can be considered as an
affine space of dimension $p$, or $E_p$. When all the equations, say
$x^\kappa \overset{u}{U}_\kappa + \overset{u}{P} = 0$ $(u = p + 1, \dots, n)$, are independent, they are a null form of the $E_p$.
For each $u$, the $\overset{u}{U}_\kappa$ are the components in $(\kappa)$ of a covariant vector; these
$n - p$ linearly independent vectors $\overset{u}{U}$ span a covariant domain. The vectors
of this domain will be said to belong to the $E_p$.

If $B^\kappa_b \overset{u}{U}_\kappa = 0$ $(b = 1, \dots, p)$, and $K^\kappa \overset{u}{U}_\kappa + \overset{u}{P} = 0$, then $x^\kappa = \eta^b B^\kappa_b + K^\kappa$
belongs to the submanifold for all $\eta^b$. If the matrix $[B^\kappa_b]$ has rank $p$, $x^\kappa = \eta^b B^\kappa_b
+ K^\kappa$ is a parametric form of the $E_p$, and the sets $\eta^b$ may be used as co-ordinates.
The contravariant vectors $\underset{b}{B}$ with components $B^\kappa_b$ in $(\kappa)$ span a contravariant
domain which will be said to belong to the submanifold.

A $p$-direction, or improper $E_{p-1}$, may be regarded either as the set of all
submanifolds parallel to a given $E_p$, or as their common "points at infinity."
Since parallel submanifolds have the same covariant and contravariant
domains, we may also say that these domains belong to the $p$-direction, which
is called their support.


2. Inner and outer orientation. Two sets of ordered, linearly independent
contravariant vectors $\underset{i}{G}$ and

$$\underset{j}{H} = \overset{i}{\underset{j}{A}}\, \underset{i}{G} \qquad (i, j = 1, \dots, n)$$

are said to have the same screw-sense if the determinant

$$\bigl|\overset{i}{\underset{j}{A}}\bigr| > 0,$$

and opposite screw-senses if

$$\bigl|\overset{i}{\underset{j}{A}}\bigr| < 0.$$

The screw-sense of the set $\underset{i}{G}$ may be identified with the sign of the determinant

$$\bigl|\underset{i}{G}{}^{\kappa}\bigr|$$

if the allowed co-ordinate transformations

$$x^{\kappa'} = x^{\kappa} A^{\kappa'}_{\kappa} + A^{\kappa'} \qquad (\kappa = 1, \dots, n;\ \kappa' = 1', \dots, n')$$

are restricted to those with determinant $|A^{\kappa'}_{\kappa}|$ positive. The space is then
oriented, and is said to have an inner orientation (or screw-sense), namely that fixed
by the contravariant basis (or measuring) vectors (1, p. 7; 2, p. 12; 3, p. 5)
of an allowed co-ordinate system.
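
In a fixed co-ordinate system the comparison of screw-senses reduces to comparing the signs of two determinants, since the determinant of the second set equals the determinant of the first set multiplied by the determinant of the connecting matrix. The sketch below illustrates this; the helper names are hypothetical, and vectors are given by their components as rows of a square array.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign, result = 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # singular: the rows are dependent
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return sign * result

def same_screw_sense(G, H):
    """True when the ordered vector sets G and H have the same screw-sense.

    If H is obtained from G by a matrix A, then det(H) = det(A) det(G), so
    det(A) > 0 exactly when det(G) and det(H) have the same sign.
    """
    dG, dH = det(G), det(H)
    if dG == 0 or dH == 0:
        raise ValueError("each set must be linearly independent")
    return (dG > 0) == (dH > 0)

E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # cyclic permutation of E
swapped = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]    # first two vectors interchanged
```

A cyclic permutation of three vectors preserves the screw-sense, while interchanging two of them reverses it.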

Orientation in any flat submanifold $E_p$ is defined in the same way: if
$\eta^b$ $(b = 1, \ldots, p)$ are rectilinear co-ordinates for the $E_p$, a
screw-sense is fixed in the $E_p$ by an ordered set of $p$ linearly independent
quantities

$\underset{b}{B}$ $(b = 1, \ldots, p)$

which are contravariant vectors with respect to transformations of the $\eta^b$.
These will be called contravariant vectors in the $E_p$, and may be represented
by pairs of points of the $E_p$. Any such point-pair also represents a
contravariant vector in the $E_n$, and in this way an isomorphic correspondence
can be set up between the contravariant vectors in the $E_p$ and the
contravariant vectors in the $E_n$ belonging to the $E_p$ (3, p. 20). For
example, if

$x^\kappa = \eta^{b'} B^\kappa_{b'} + K^\kappa$ $(b' = 1', \ldots, p')$

is a parametric form for the submanifold, the contravariant vectors with
components $B^\kappa_{b'}$ in $(\kappa)$ correspond to the contravariant basis (or
measuring) vectors $\underset{b'}{e}$ of the co-ordinate system $(b')$ in the
$E_p$. Through this correspondence, an ordered set of vectors

$\underset{b}{B}$ $(b = 1, \ldots, p)$

spanning the domain belonging to the $E_p$ fixes a screw-sense in the $E_p$. The
set

$\underset{c}{C} = A^b_c \underset{b}{B}$ $(b, c = 1, \ldots, p)$

determines the same screw-sense if the determinant

$|A^b_c| > 0$.


Inner orientation in a submanifold $E_p$ could therefore be defined in terms
of the contravariant vectors belonging to it. The equivalent considerations
for covariant vectors suggest the following definition: an outer orientation
around an $E_p$ is defined by an ordered set $\overset{u}{U}$ of $n - p$ covariant
vectors spanning the covariant domain belonging to the $E_p$. If











538 M. J. ENGLEFIELD 


$\overset{v}{V} = A^v_u \overset{u}{U}$ $(u, v = p+1, \ldots, n)$,

the outer orientations determined by the $\overset{v}{V}$ and the
$\overset{u}{U}$ are defined to be the same or opposite according as the
determinant $|A^v_u|$ is positive or negative. With these definitions, it is
clear that orientation of an $E_p$ is essentially a property of its
$p$-direction.
To each set of $n$ independent contravariant vectors $\underset{i}{G}$
corresponds a reciprocal set of covariant vectors $\overset{j}{W}$ satisfying

$G^\kappa_i W^j_\kappa = \delta^j_i$.

By this correspondence, an ordered set of $n$ linearly independent covariant
vectors fixes an inner orientation in the $E_n$.
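
The two definitions above lend themselves to a small numerical check. The following is only a sketch, not part of the original argument: it assumes vectors are given by their components in one fixed co-ordinate system, and the helper names `det`, `same_screw_sense` and `reciprocal_set` are hypothetical. Two ordered sets have the same screw-sense when the determinant of the transformation between them is positive, and the reciprocal set is read off from the inverse of the component matrix.

```python
from fractions import Fraction

def det(rows):
    """Exact determinant by Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, sign, prod = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        prod *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return sign * prod

def same_screw_sense(G, H):
    # Rows of G and H are vectors. H = A G gives det H = det A * det G,
    # so det A > 0 exactly when det G and det H have the same sign.
    return det(G) * det(H) > 0

def reciprocal_set(G):
    """Inverse of G; entry [k][j] is the k-th component of the covariant
    vector W^j, so that G_i . W^j = delta_i^j."""
    n = len(G)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(G)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if a[r][c] != 0)
        a[c], a[pivot] = a[pivot], a[c]
        p = a[c][c]
        a[c] = [x / p for x in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[n:] for row in a]
```

A cyclic permutation of three basis vectors keeps the screw-sense, while a single interchange reverses it; this is the behaviour the determinant criterion is meant to capture.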

Given an $E_p$, let $x^\kappa = \eta^y B^\kappa_y + K^\kappa$ $(y = p+1, \ldots, n)$ be
a parametric form of an $E_{n-p}$ with no direction in common with the $E_p$.
This means that a contravariant vector cannot belong to both the $E_p$ and the
$E_{n-p}$, and so their contravariant domains span the contravariant domain of
the $E_n$. If a covariant vector $X$ is represented (1, p. 7; 2, p. 10; 3, p. 5)
by the parallel hyperplanes $x^\kappa X_\kappa = c$, $x^\kappa X_\kappa = c + 1$, the figure
intersects the $E_{n-p}$ in the points
$\eta^y (B^\kappa_y X_\kappa) = c - K^\kappa X_\kappa$,
$\eta^y (B^\kappa_y X_\kappa) = c - K^\kappa X_\kappa + 1$, which represent a covariant
vector in the $E_{n-p}$ with components $B^\kappa_y X_\kappa$ in $(y)$. If $X$ belongs
to the $E_p$, $B^\kappa_y X_\kappa$ cannot vanish for all $y$, for $X$ would then have
zero transvection with every contravariant vector belonging to the $E_p$ or the
$E_{n-p}$, and hence with every contravariant vector in $E_n$. Thus an isomorphic
correspondence can be set up between the covariant vectors belonging to $E_p$,
and the covariant vectors in the $E_{n-p}$. Since an ordered set of $n - p$
linearly independent covariant vectors in the $E_{n-p}$ fixes a screw-sense in
it, an outer orientation of the $E_p$ determines an inner orientation in the
$E_{n-p}$.¹ The latter can be any $E_{n-p}$ with no direction in common with the
$E_p$.²

This result is illustrated by the example of §6, where the line $\alpha$ plays
the role of the $E_p$, and the plane $\sigma$ that of the $E_{n-p}$. An outer
orientation around $\alpha$ is fixed by the covariant vectors $U$ and $V$, which
are respectively represented by the ordered pairs of parallel planes
$\pi_1, \pi_2$ and $\rho_1, \rho_2$. Their intersections with $\sigma$, the parallel
lines shown in Figure 3, are the representations of the corresponding covariant
vectors in $\sigma$; $B$ and $C$ are the reciprocal set of contravariant vectors
in $\sigma$.


3. The sections and joins of flat submanifolds. Consider an $E_p$ with
a null form $x^\kappa U^u_\kappa + P^u = 0$ $(u = p+1, \ldots, n)$, and a parametric
form $x^\kappa = \eta^b B^\kappa_b + K^\kappa$ $(b = 1, \ldots, p)$. Consider also an $E_q$
with a null form $x^\kappa V^v_\kappa + Q^v = 0$ $(v = q+1, \ldots, n)$, and a
parametric form $x^\kappa = \zeta^c C^\kappa_c + L^\kappa$


¹ Schouten used this property to introduce outer orientation (1, p. 5; 2, p. 7;
3, p. 4), and then gave my definition as a property (3, p. 6).

² The $E_{n-p}$ may also be considered as that arising by reduction of the space
with respect to the $p$-direction of the $E_p$ (3, p. 21).













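$(c = 1, \ldots, q)$. Then $B^\kappa_b U^u_\kappa = 0$ for each $u$ and $b$,
$C^\kappa_c V^v_\kappa = 0$ for each $v$ and $c$, $K^\kappa U^u_\kappa + P^u = 0$ for each
$u$, and $L^\kappa V^v_\kappa + Q^v = 0$ for each $v$. From these conditions, the
following lemma may be proved: if the $n \times (2n - p - q)$ matrix

$\Sigma = (U^u_\kappa \mid V^v_\kappa)$

has rank $n - s_1$, then (i) $0 \le s_1 \le p$, $p + q - n \le s_1 \le q$; (ii) the
$(p + q) \times n$ matrix

$\Gamma = \begin{bmatrix} B^\kappa_b \\ C^\kappa_c \end{bmatrix}$

has rank $t_1 = p + q - s_1$; (iii) there exist numbers $\chi^b_d$, $\psi^c_d$
$(d = 1, \ldots, s_1)$ such that

$D^\kappa_d = \chi^b_d B^\kappa_b = \psi^c_d C^\kappa_c$

are a complete set of solutions of the equations $D^\kappa \Sigma = 0$.

The method of proof was outlined by Schouten (2, p. 8, Examples I.2, I.3).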
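In the same way, if the $(n + 1) \times (2n - p - q)$ matrix

$\Omega = \begin{bmatrix} U^u_\kappa & V^v_\kappa \\ P^u & Q^v \end{bmatrix}$

has rank $n - s$, then (i) $-1 \le s \le p$, $p + q - n \le s \le q$; (ii) the
$(p + q + 1) \times n$ matrix

$\Delta = \begin{bmatrix} \Gamma \\ L^\kappa - K^\kappa \end{bmatrix}$

has rank $t = p + q - s$; (iii) there are numbers $a^w_u$, $b^w_v$
$(w = 1, \ldots, n - t)$ such that

$P^u a^w_u = Q^v b^w_v$,

and

$W^w_\kappa = U^u_\kappa a^w_u = V^v_\kappa b^w_v$

are a complete set of solutions of the equations $\Delta W_\kappa = 0$.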

There are three possibilities regarding the section of $E_p$ and $E_q$: (i) they
have a finite point in common; (ii) they have no finite point in common, but
there is a common direction; (iii) they have no common point or direction.
These cases correspond to different values of $s_1$ and $s$.
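
The three cases can be told apart numerically from the two ranks in the lemma. The sketch below is illustrative only and makes assumptions not in the text: each submanifold is supplied as a list of null-form equations `(coefficients, constant)` meaning $x^\kappa U_\kappa + P = 0$, and the names `rank` and `classify` are hypothetical. Stacking the coefficient rows gives the transpose of the $n$-rowed matrix of the lemma, and appending the constants gives the transpose of the $(n+1)$-rowed one, so the ranks are the same.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gauss-Jordan elimination."""
    a = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(a[0]) if a else 0
    for c in range(ncols):
        pivot = next((r for r in range(rk, len(a)) if a[r][c] != 0), None)
        if pivot is None:
            continue
        a[rk], a[pivot] = a[pivot], a[rk]
        for r in range(len(a)):
            if r != rk and a[r][c] != 0:
                f = a[r][c] / a[rk][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[rk])]
        rk += 1
    return rk

def classify(n, eqs_p, eqs_q):
    """Return (s1, s, case) for two flat submanifolds of an n-dimensional
    affine space, each given by its independent null-form equations."""
    eqs = list(eqs_p) + list(eqs_q)
    s1 = n - rank([list(u) for u, _ in eqs])        # from the n-rowed matrix
    s = n - rank([list(u) + [c] for u, c in eqs])   # from the (n+1)-rowed matrix
    if s1 == s:
        case = "proper section"
    elif s == -1:
        case = "null section"
    else:
        case = "improper section"
    return s1, s, case

# Illustrative data in a 3-dimensional space (co-ordinates x, y, z):
x_axis = [([0, 1, 0], 0), ([0, 0, 1], 0)]          # y = 0, z = 0
skew_line = [([1, 0, 0], 0), ([0, 0, 1], -1)]      # x = 0, z = 1
parallel_line = [([0, 1, 0], -1), ([0, 0, 1], 0)]  # y = 1, z = 0
print(classify(3, x_axis, skew_line))
print(classify(3, x_axis, parallel_line))
```

The two sample pairs correspond to the situations drawn later in Figures 2 and 1 respectively.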

(i) $s_1 = s$, implying $s \ne -1$. If
















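$\begin{bmatrix} Z^g_\kappa \\ R^g \end{bmatrix}$

$(g = s+1, \ldots, n)$ are any $n - s$ linearly independent columns of $\Omega$,
then $x^\kappa Z^g_\kappa + R^g = 0$ is a null form of the section, which is therefore
an $E_s$. If $M^\kappa$ are the co-ordinates of any point of the section, and the
$D^\kappa_d$ $(d = 1, \ldots, s)$ are the solutions of $D^\kappa \Sigma = 0$ given by the
lemma, then $x^\kappa = \omega^d D^\kappa_d + M^\kappa$ is a parametric form of the section.
So the contravariant domain of the section is the set of vectors which belong
to both $E_p$ and $E_q$.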

Defining

$W^w_\kappa = U^u_\kappa a^w_u = V^v_\kappa b^w_v$

and

$S^w = P^u a^w_u = Q^v b^w_v$,

$x^\kappa W^w_\kappa + S^w = 0$ is a null form of the join, which is therefore an
$E_t$. If $N^\kappa$ are the co-ordinates of any point of this $E_t$, and $F^\kappa_f$
$(f = 1, \ldots, t)$ any $t$ independent linear combinations of the rows of
$\Delta$ (or $\Gamma$), then $x^\kappa = \theta^f F^\kappa_f + N^\kappa$ is a parametric form
of the join. The set of covariant vectors which belong to both $E_p$ and $E_q$
is the covariant domain of the join.

(ii) $s_1 = s + 1$, $s \ne -1$.

Since the equations

$[x^\kappa \mid 1]\,\Omega = 0$

are inconsistent, $E_p$ and $E_q$ have no finite point in common. However, the
vectors with components $D^\kappa_d$ belong to $E_p$ and $E_q$, and span a
contravariant domain of dimension $s_1$, so it is natural to consider the
support of this domain as the section of $E_p$ and $E_q$. This is consistent
with our usual conception of parallel manifolds, for if $E_p$ is parallel to a
submanifold contained in $E_q$, the section will be the $p$-direction of the
$E_p$. In general only a submanifold of $E_p$ will be parallel to a submanifold
of $E_q$: for instance, two planes in an $E_4$ may have a single common
direction. The covariant domain of the section is the set of vectors which have
zero transvection with every vector of the contravariant domain. If $Z^g_\kappa$
are any $n - s_1 = n - s - 1$ linearly independent columns of $\Sigma$, the
vectors with these components in $(\kappa)$ span the covariant domain.

Null and parametric forms of the join, and hence bases for its domains, are
obtained exactly as in (i). There, however, $t$ linearly independent rows of
$\Delta$ may be chosen from $\Gamma$, but in this case the last row of $\Delta$ must
be used.

(iii) $s_1 = 0$, $s = -1$.

The equations $[x^\kappa \mid 1]\,\Omega = 0$ are inconsistent, and the equations
$D^\kappa \Sigma = 0$ have no non-zero solution. This means that $E_p$ and $E_q$ have
no point or direction in common. The previous procedure gives null and
parametric forms of the join, which is thus an $E_{p+q+1}$.










4. Orientation in the join when the section is a proper $E_s$. The
simplest case occurs when the section is a single point, so that $s_1 = s = 0$,
and $t = p + q$. An inner orientation for the join is fixed by an ordered set of
contravariant vectors spanning the domain of the join. Any parametric form of
the join specifies such a set; hence, from the previous section, the rows of
$\Gamma$ may be taken. We must remember, however, that $E_p$ and $E_q$ do not
uniquely specify the rows of $\Gamma$. On the other hand, the set of contravariant
vectors does not need to be unique, provided that the ambiguity does not affect
the screw-sense determined.

Suppose an inner orientation is specified in both $E_p$ and $E_q$. For the
ordered set of $t$ contravariant vectors, take any set fixing the given
orientation in $E_p$, followed by any set fixing the orientation given in $E_q$:

$\underset{1}{B}, \ldots, \underset{p}{B}, \underset{1}{C}, \ldots, \underset{q}{C}$.

Any two sets chosen in this way are related by a transformation with positive
determinant, and so specify the same orientation in the join. This selection
implies the ordering of $E_p$ and $E_q$. If $pq$ is even, this ordering is
irrelevant, because the set

$\underset{1}{C}, \ldots, \underset{q}{C}, \underset{1}{B}, \ldots, \underset{p}{B}$

is changed into the above set by a permutation with the same parity as $pq$;
the determinant of the corresponding transformation is $(-1)^{pq}$.
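
The parity claim can be checked by counting inversions. The snippet is a sketch only; the function names are hypothetical, and the vectors are represented simply by their positions 0, ..., p+q-1 in the first ordering.

```python
def permutation_sign(perm):
    """Sign of a permutation of 0..m-1, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def block_swap_sign(p, q):
    """Sign of the rearrangement of (B_1..B_p, C_1..C_q) into (C_1..C_q, B_1..B_p)."""
    perm = list(range(p, p + q)) + list(range(p))
    return permutation_sign(perm)

# Every C precedes every B after the swap, giving p*q inversions in all.
for p in range(6):
    for q in range(6):
        assert block_swap_sign(p, q) == (-1) ** (p * q)
```

For two lines meeting in a point (p = q = 1) the sign is negative, so the lines must be ordered; for a line and a plane meeting in a point (p = 1, q = 2) the order is immaterial.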

We are now ready to consider the more general case in which $s_1 = s > 0$.
The choice of contravariant vectors belonging to the join must be modified
because the rows of $\Gamma$ are not linearly independent. When $pq + st$ is even,
an orientation in the join of $E_p$ and $E_q$ is determined by specifying
orientations in $E_p$, $E_q$, and their section $E_s$; this is true when
$pq + st$ is odd if $E_p$ and $E_q$ are ordered (2, p. 8, Example I.5).

Choose first any set of $s_1$ vectors $\underset{d}{D}$ which determines the given
orientation in the section; then choose any set of $p - s_1$ vectors

$\underset{g}{G}$ $(g = s_1 + 1, \ldots, p)$

such that the set $\underset{d}{D}; \underset{g}{G}$ determines the given
orientation in $E_p$; finally choose any set of $q - s_1$ vectors

$\underset{h}{H}$ $(h = s_1 + 1, \ldots, q)$

such that the set $\underset{d}{D}; \underset{h}{H}$ determines the given
orientation in $E_q$. The set $\underset{d}{D}; \underset{g}{G}; \underset{h}{H}$
is changed into the set $\underset{d}{D}; \underset{h}{H}; \underset{g}{G}$ by a
transformation with determinant

$(-1)^{(p-s)(q-s)} = (-1)^{pq+st}$.

So $E_p$ and $E_q$ need not be ordered if $pq + st$ is even. The necessity of
fixing a screw-sense in the section is seen by the effect of interchanging two
of the vectors $\underset{d}{D}$. To preserve the given orientations in










$E_p$ and $E_q$, we could, for example, simply interchange two of the
$\underset{g}{G}$, and interchange two of the $\underset{h}{H}$. The resulting
three interchanges in the whole set give a set specifying the opposite
orientation in the join.

If $p = s_1 = s$, giving an orientation in $E_p$ also gives an orientation in
$E_s$. In all the results, it is to be understood that when $p$ or $q$ equals
$s_1$, the chosen orientations are consistent with each other. When the section
is proper, the theorems are then trivial.


5. Orientation in the join when the section is improper or null.
The appropriate changes when the section is improper or null lead to the
following theorem: if $pq + st + p + q$ is even, an orientation in the join of
$E_p$ and $E_q$ is determined by specifying orientations in $E_p$, $E_q$, and
their section $E_s$ (unless this is null); but if $pq + st + p + q$ is odd,
$E_p$ and $E_q$ must be ordered. When $p$ (or $q$) equals $s_1 = s + 1$, it is
sufficient to give an orientation in $E_q$ (or $E_p$) and to order $E_p$ and
$E_q$.

The changes in the proof of the theorem are due to the dimension of the
contravariant domain of the section now being $s + 1$, and to the fact that no
$t$ rows of $\Gamma$ are linearly independent. Choose (i) a set of $s + 1$ vectors
$\underset{d}{D}$ which fixes the given screw-sense in the section, (ii) a set of
$p - s - 1$ vectors $\underset{g}{G}$ such that the set
$\underset{d}{D}; \underset{g}{G}$ fixes the given screw-sense in $E_p$, (iii) a
set of $q - s - 1$ vectors $\underset{h}{H}$ such that
$\underset{d}{D}; \underset{h}{H}$ fixes the given screw-sense in $E_q$, and (iv) a
vector $A$ which may be represented by a point in $E_p$ and a point in $E_q$
$(d = 1, \ldots, s_1;\ g = s_1 + 1, \ldots, p;\ h = s_1 + 1, \ldots, q)$. From cases
(ii) and (iii) of section 3, these $t$ vectors span the contravariant domain of
the join.
As in the previous case, it is obvious that the arbitrariness in the choice of
the $\underset{d}{D}$, $\underset{g}{G}$, and $\underset{h}{H}$ will not affect the
orientation determined in the join. Suppose $\underset{1}{A}$ and
$\underset{2}{A}$ are two different selections of the last vector, and let

(1) $\underset{2}{A} = \Phi^d \underset{d}{D} + \Theta^g \underset{g}{G} + \Psi^h \underset{h}{H} + \Lambda \underset{1}{A}$.

The determinant of the transformation between the two sets is then $\Lambda$.
Let $Z_\kappa$ $(\kappa = 1, \ldots, n)$ satisfy $\Gamma Z_\kappa = 0$, but not
$\Delta Z_\kappa = 0$. Choose the $Z_\kappa$ so that $Z_\kappa L^\kappa - Z_\kappa K^\kappa = 1$,
and put $\delta = Z_\kappa K^\kappa$. Then the $Z_\kappa$ are the components of a
covariant vector $Z$ represented by the parallel hyperplanes
$Z_\kappa x^\kappa = \delta$, which contains $E_p$, and $Z_\kappa x^\kappa = \delta + 1$,
which contains $E_q$. Taking the transvection of $Z$ and each side of (1) gives
$\Lambda = 1$. Thus the choice of $A$ does not affect the orientation given in
the join.

$E_p$ and $E_q$ need not be ordered if the sets
$\underset{d}{D}; \underset{g}{G}; \underset{h}{H}; A$ and
$\underset{d}{D}; \underset{h}{H}; \underset{g}{G}; -A$ define the same
orientation in the join. The determinant of the transformation between these
sets is

$(-1)^{(p-s-1)(q-s-1)+1} = (-1)^{pq+st+p+q}$.
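
Both parity rules, the one for a proper section in §4 and the one just obtained, rest on elementary identities modulo 2 once $t = p + q - s$ is substituted. The following sketch checks them exhaustively over a small range; the function names and the range bound are arbitrary choices for illustration.

```python
def sign_of_swap(block_g, block_h, extra_sign=1):
    """Determinant of the transformation interchanging two adjacent blocks of
    block_g and block_h vectors, times an optional sign for reversing one
    further vector (the vector A of this section)."""
    return (-1) ** (block_g * block_h) * extra_sign

def check_parity_rules(max_dim=8):
    for p in range(1, max_dim):
        for q in range(1, max_dim):
            # Proper section E_s: the blocks G and H have p - s and q - s vectors.
            for s in range(0, min(p, q) + 1):
                t = p + q - s
                assert sign_of_swap(p - s, q - s) == (-1) ** (p * q + s * t)
            # Improper or null section: blocks of p-s-1 and q-s-1 vectors, A -> -A.
            for s in range(-1, min(p, q)):
                t = p + q - s
                expected = (-1) ** (p * q + s * t + p + q)
                assert sign_of_swap(p - s - 1, q - s - 1, -1) == expected
    return True

check_parity_rules()
```

For two parallel lines (p = q = 1, s = 0) the sign is negative, so the lines must be ordered; for two skew lines (p = q = 1, s = -1) it is positive, in agreement with the remarks on Figures 1 and 2.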










FIGURE 1. (p = q = 1, s = 0).


FIGURE 2. (p = q = 1, s = -1).


Figures 1 and 2 illustrate simple applications of this result. In the first, the 
set D; A fixes a screw-sense in the plane which is the join of the parallel lines. 
The sense of D is determined by a specified orientation in the section, and the 
sense of A by ordering the lines. 

In the second, an orientation is fixed in a 3-dimensional space by specifying
senses for the skew lines $\alpha$ and $\beta$. The sets of vectors
$\underset{1}{H}; \underset{1}{G}; \underset{1}{A}$ and
$\underset{2}{G}; \underset{2}{H}; \underset{2}{A}$ are both measuring vectors for
co-ordinate systems with right-handed axes, showing that it is unnecessary to
order the lines. The parallel planes $\pi$ and $\rho$, which are only drawn to
help visualize the 3-dimensional figure, are a representation, in this case, of
the covariant vector $Z$ used in the proof. $Z$ is uniquely determined when
$n = t$.













6. Orientation around a proper section. The analogous considerations
for covariant vectors lead to certain theorems on outer orientation. If the
section of $E_p$ and $E_q$ is a proper $E_s$, an outer orientation around the
section is determined if outer orientations are given around $E_p$, and $E_q$,
and their join $E_t$, provided that $E_p$ and $E_q$ are ordered when $pq + st$ is
odd. Of course, when the join is the whole space, we do not have to give an
outer orientation around it (cf. §4 when $s = 0$).


FIGURE 3. (p = q = 2, s = 1).


As an example in $E_3$, consider two planes $\pi$ and $\rho$ whose section is a
line $\alpha$. Figure 3 shows their intersections with some plane $\sigma$ that
does not contain $\alpha$. According to the theorem, giving outer orientations to
$\pi$ and $\rho$, in that order, fixes an outer orientation around $\alpha$, which
in turn fixes an inner orientation in $\sigma$. Let the ordered pair of parallel
planes $\pi_1$ and $\pi_2$ represent a covariant vector $U$ which determines the
outer orientation around $\pi$; and similarly $\rho_1$ and $\rho_2$ represent $V$
which determines the orientation around $\rho$. The screw-sense in $\sigma$ is
that given by the contravariant vectors $B$ and $C$; the outer orientation
around $\alpha$ is thus a clockwise rotation when seen from this side of
$\sigma$.


7. Orientation around an improper or a null section. When the
section $E_s$ is improper, or null, its covariant domain has dimension
$n - s_1 = n - s - 1$. Suppose $E_p$ and $E_q$ are ordered, and that outer
orientations are given around $E_p$, $E_q$, and their join. Choose $n - s - 1$
covariant vectors as follows: (i) a set $\overset{w}{W}$ which specifies the
outer orientation around the join






















$(w = 1, \ldots, n - t)$; (ii) a vector $Z$, the support of which contains the
directions of both $E_p$ and $E_q$, and the sense of which is determined by the
order $E_p$, $E_q$; (iii) a set $\overset{x}{X}$ $(x = n - t + 2, \ldots, n - p)$
of $t - p - 1$ vectors such that the set
$\overset{w}{W}; Z; \overset{x}{X}$ specifies the outer orientation around
$E_p$; and (iv) a set $\overset{y}{Y}$ $(y = n - t + 2, \ldots, n - q)$ of
$t - q - 1$ vectors such that the set $\overset{w}{W}; Z; \overset{y}{Y}$
specifies the outer orientation around $E_q$. The vector $Z$ is the same as
that used in §5. If $t = p + 1$ (or $q + 1$) the set $\overset{x}{X}$ (or
$\overset{y}{Y}$) does not appear; and if $t = n$, the set $\overset{w}{W}$ does
not appear. Suitable alterations to the proof in §5 show that this set
determines an outer orientation around the section.


Changing the order of $E_p$ and $E_q$ gives a set
$\overset{w}{W}; -Z; {}'\overset{y}{Y}; {}'\overset{x}{X}$, where the sets
${}'\overset{y}{Y}$, ${}'\overset{x}{X}$ are related to the sets $\overset{y}{Y}$,
$\overset{x}{X}$ by transformations with negative determinants. The order of
$E_p$ and $E_q$ is therefore irrelevant if $(t - p - 1)(t - q - 1) + 1$ is
even, that is, if $pq + st + p + q$ is even.

In general, then, when $pq + st + p + q$ is even, an outer orientation around
the (null or improper) section of $E_p$ and $E_q$ is determined if outer
orientations are specified around $E_p$, around $E_q$, and around their join
unless this is the whole space; when $pq + st + p + q$ is odd, an order must
also be assigned to $E_p$ and $E_q$. If $t = p + 1$ (or $t = q + 1$), an
orientation around the section is determined by the orientation around $E_q$
(or $E_p$). When $t = p + 1 = q + 1$, it is sufficient to give an orientation
around the join and to order $E_p$ and $E_q$.

The covariant domain of a null section has dimension $n$, so an outer
orientation of a null section may be interpreted as an inner orientation for
the $E_n$. Figure 2 illustrates the case of skew lines in an $E_3$; suppose
$\alpha$ and $\beta$ are given the outer orientations shown by the arrowed
circles. For the covariant vectors





FIGURE 4.















of the theorem, the set $Z, X, Y$ reciprocal to
$\underset{2}{A}, \underset{2}{H}, \underset{2}{G}$ may be chosen. Alternatively,
taking the lines in the opposite order, the set $-Z, {}'Y, {}'X$ reciprocal to
$\underset{1}{A}, -\underset{1}{G}, -\underset{1}{H}$ may be chosen. These sets
are the measuring vectors of co-ordinate systems with left-handed axes.

The theorem may also be applied to two parallel lines $\alpha$ and $\beta$ in a
plane $\pi$ in an $E_3$ (Figure 4). An outer orientation around the direction
of $\alpha$ and $\beta$ is determined by giving an outer orientation to $\pi$ and
assigning an order to $\alpha$ and $\beta$. The conditions respectively determine
the senses of the contravariant vectors $A$ and $B$. Rotating $A$ into $B$
gives an outer orientation for any line parallel to $\alpha$ and $\beta$.

In conclusion, I should like to thank the National Research Council for 
the award of a Postdoctorate Fellowship. 


REFERENCES 


1. J. A. Schouten, Ricci Calculus (2nd ed.; Berlin, 1954).
2. ———, Tensor Analysis for Physicists (2nd ed.; Oxford, 1954).
3. J. A. Schouten and W. van der Kulk, Pfaff's Problem and its Generalizations
(Oxford, 1949).


National Research Council 
Ottawa 













ON THE LATTICE OF TOPOLOGIES 


JURIS HARTMANIS 


In many cases Lattice Theory has proven itself to be useful in the study of
the totality of mathematical systems of a given type. In this paper we shall
continue one such study by investigating further the lattice of all topologies
on a given set S. A considerable amount of research has been done in this
field (1; 2; 3; 5; 6). This research, besides satisfying the intrinsic interest
in the lattice theoretic properties of this lattice, has aided the study of
interconnections of different properties of point set topologies.

We shall show that the lattice of all topologies on a set consisting of more
than two elements has only trivial homomorphisms. On the other hand it
will be shown that this is not true for the lattice consisting of all
$T_1$-topologies on S and the lattice of complete homomorphisms will be
constructed in this case. We shall also show that the lattice of all topologies
is complemented if S is finite. Finally we shall construct the group of
automorphisms for the lattice of all topologies and for the lattice of all
$T_1$-topologies on S. We shall conclude with a definition of a lattice
theoretic property which clarifies the change of properties of the lattice of
topologies as we go from the finite to the infinite case.

We shall represent a topology R on the set S by the collection of its closed
sets, $R = \{S_\alpha\}$. If $R_1$ and $R_2$ are topologies on S then
$R_1 \le R_2$ if and only if every set closed under $R_1$ is also closed under
$R_2$. It can be seen that under this ordering the set of all topologies on S
forms a complete point lattice. The intersection of two topologies $R_1$ and
$R_2$ in the lattice is the topology whose closed sets are the sets closed
under both $R_1$ and $R_2$. The union of two topologies $R_1$ and $R_2$ is the
topology whose closed sets are intersections of finite unions of the closed
sets of $R_1$ and $R_2$. Let us denote the lattice of all topologies on S by
$LT(S)$ and similarly let $LT_1(S)$ denote the lattice of all $T_1$-topologies
on S.
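
For a finite set the two lattice operations can be computed directly from the closed-set description. The sketch below is an illustration under stated assumptions, not a construction from the paper: a topology is stored as a frozenset of closed sets, the helper names `generate`, `meet` and `join` are hypothetical, and closure under pairwise unions and intersections is enough only because S is finite.

```python
from itertools import combinations

def generate(closed_sets, S):
    """Smallest family containing the given sets, the void set and S that is
    closed under pairwise unions and intersections (sufficient for finite S)."""
    family = {frozenset(), frozenset(S)} | {frozenset(c) for c in closed_sets}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(family), 2):
            for c in (a | b, a & b):
                if c not in family:
                    family.add(c)
                    changed = True
    return frozenset(family)

def meet(R1, R2):
    # Lattice intersection: the sets closed under both topologies.
    return R1 & R2

def join(R1, R2, S):
    # Lattice union: the topology generated by the closed sets of both.
    return generate(R1 | R2, S)

S = {1, 2, 3}
R1 = generate([{1}], S)   # the point {void, {1}, S}
R2 = generate([{2}], S)   # the point {void, {2}, S}
print(sorted(map(sorted, meet(R1, R2))))
print(sorted(map(sorted, join(R1, R2, S))))
```

Here the meet of the two points is the zero element, and their join contains exactly three points of the lattice.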


We shall now investigate the homomorphisms of $LT(S)$.


LEMMA 1. If $\theta$ is a nontrivial homomorphism on a point lattice L, then
there exists a point p of L such that $p \equiv 0\ (\theta)$.


Proof. Let $\theta$ be a nontrivial homomorphism on L. Then there exist two
elements a and b in L, $a > b$, such that $a \equiv b\ (\theta)$. Since L is a
point lattice there exists a point p such that $a \cap p = p$ and
$b \cap p = 0$. But then $p = a \cap p \equiv b \cap p = 0$.





Received January 28, 1958. 













LEMMA 2. If $\theta$ is a homomorphism on $LT(S)$ which identifies at least two
distinct elements, then a topology of the form $\{\phi, p, S\}$, $p \in S$, is
identified with the zero element.


Proof. The lattice of topologies on the set S is a point lattice. The points
are topologies of the form $\{\phi, D, S\}$, $\phi \ne D \subset S$. Thus by
Lemma 1 a topology of this form is congruent to the zero element, say
$\{\phi, D, S\} \equiv \{\phi, S\}\ (\theta)$. If $D = \{p\}$, some p in S, then the
lemma holds. Otherwise let p be in D. Then the quotients
$\{\phi, (S - D) \vee p, D, p, S\} : \{\phi, (S - D) \vee p, S\}$ and
$\{\phi, D, S\} : \{\phi, S\}$ are perspective. We also observe that the quotient
$\{\phi, p, S\} : \{\phi, S\}$ is perspective into the first quotient. But then all
these quotients are collapsed by the homomorphism $\theta$ since it collapses
$\{\phi, D, S\} : \{\phi, S\}$.


THEOREM 1. There are only trivial homomorphisms on the lattice of topologies 
on a set consisting of more than two elements. 


Proof. Let $\theta$ be a homomorphism which identifies two distinct elements of
$LT(S)$. Then by Lemma 1 a topology of the form $\{\phi, D, S\}$ is congruent to
the zero element. Assume that D contains at least two distinct elements, say
p and q. Then by Lemma 2 we have that
$\{\phi, p, S\} \equiv \{\phi, q, S\} \equiv \{\phi, S\}$. Let us denote by $R(-p)$ the
topology on S whose closed sets are the set S and the subsets of S which do
not contain the element p. Let $R(p)$ denote the topology whose closed sets are
the void set and all the subsets of S which contain the element p. Then we
observe that $R(-p) \cap R(p) = \{\phi, S\}$ and
$R(-p) \cup \{\phi, p, S\} = I$. This implies that $R(-p) \equiv I\ (\theta)$. Thus
$R(-p) \cap R(p) \equiv I \cap R(p) = R(p) \equiv \{\phi, S\}$. Similarly, it
follows that $R(q) \equiv \{\phi, S\}$. Since $R(p) \cup R(q) = I$ we conclude
that $I \equiv \{\phi, S\}$ which shows that $\theta$ is a trivial homomorphism. In
the case D consists of a single element, $D = \{p\}$, we obtain similarly that
$R(p) \equiv \{\phi, S\}$. From this it follows, if we recall that the set S
contains more than two elements, that there exists an element q in S,
$q \ne p$, so that $\{\phi, S\} < \{\phi, q \vee p, S\} < R(p)$. This implies that
$\{\phi, q \vee p, S\} \equiv \{\phi, S\}$. Now we can complete the proof as in the
previous case if we set $D = p \vee q$.


We shall now show that the result previously derived does not hold if we
consider only $T_1$-topologies on a set S.

The set of all $T_1$-topologies on S forms a complete sublattice of the lattice
of all topologies on S. We shall assume that S is infinite since otherwise the
lattice of $T_1$-topologies consists of a single element. It can be seen that
$LT_1(S)$ is not a point lattice and that the join irreducible elements are
topologies of the form $\{F_\alpha, D \vee F_\alpha, S\}$, where $\{F_\alpha\}$ is the
set of finite subsets of S and D is a proper infinite subset of S. From this we
conclude that if J is a join irreducible element and R is any element of
$LT_1(S)$ then $J \cap R$ is a join irreducible element, provided that
$J \cap R \ne 0$. We also observe that two join irreducible elements are
comparable if and only if there exists a finite set A such that they induce the
same topology on $S - A$. From this it follows that







the only join irreducible elements which contain a point of the lattice are
topologies of the form $\{F_\alpha, (S - A) \vee F_\alpha\}$, where A is a finite
subset of S. Note that the points of this lattice are topologies of the form
$\{F_\alpha, S - p, S\}$, $p \in S$.


THEOREM 2. The lattice of the complete homomorphisms of $LT_1(S)$ is isomorphic
to the lattice consisting of finite subsets of S and the set S ordered under
set inclusion.


Proof. First we shall show that to each finite subset A of S there corresponds
a complete homomorphism $\theta(A)$. To see this let $R_1$ be congruent to $R_2$
mod $\theta(A)$ if and only if $R_1$ and $R_2$ induce the same topology on
$S - A$. It can be seen that this does define a complete homomorphism on
$LT_1(S)$ and that to two distinct subsets there correspond distinct
homomorphisms.

To complete the proof we shall show that every complete homomorphism
is of this type. First we shall show that if a complete homomorphism $\theta$
maps a join irreducible element $J = \{F_\alpha, D \vee F_\alpha, S\}$ which does not
contain a point of the lattice into the zero element then the homomorphism is
trivial. Let $D_1$ be a subset of S such that $D - D_1$ and $D_1 - D$ are
infinite. Then

$J \cup \{F_\alpha, D_1 \vee F_\alpha, S\} = \{F_\alpha, D_1 \vee F_\alpha, D \vee F_\alpha, (D_1 \wedge D) \vee F_\alpha, D_1 \vee D \vee F_\alpha, S\}$

from which it follows that $\{F_\alpha, (D_1 \vee D) \vee F_\alpha, S\} \equiv 0$ and
$\{F_\alpha, (D_1 \wedge D) \vee F_\alpha, S\} \equiv 0$. Proceeding this way we can
show that all the join irreducible elements are mapped into the zero element.
Thus because of the completeness of the homomorphism $\theta$ we conclude that
it identifies all the elements of $LT_1(S)$. Assume now that $\theta$ is a
non-trivial complete homomorphism which identifies two topologies $R_1$ and
$R_2$, $R_1 < R_2$. Then there exists a join irreducible element $J_1$ such
that $R_2 \cap J_1 = J_1$ and $R_1 \cap J_1 = J_2 \ne J_1$. Then $J_2$ is a join
irreducible element or the zero element. If we let
$J_1 = \{F_\alpha, D \vee F_\alpha, S\}$ then
$J_2 = \{F_\alpha, D \vee A \vee F_\alpha, S\}$, where A is a finite subset of S and
$D \wedge A = \phi$. From this we shall show that $\theta$ identifies all
topologies which agree on $S - A$. To see this let C be an infinite subset of D
such that $D - C$ is also infinite. Then the quotients
$\{F_\alpha, D \vee F_\alpha, S\} : \{F_\alpha, D \vee A \vee F_\alpha, S\}$ and
$\{F_\alpha, C \vee F_\alpha, D \vee F_\alpha, S\} : \{F_\alpha, C \vee A \vee F_\alpha, D \vee A \vee F_\alpha, S\}$
are perspective, and the quotient
$\{F_\alpha, C \vee F_\alpha, S\} : \{F_\alpha, C \vee A \vee F_\alpha, S\}$ is perspective
into the second quotient. From this, since the first quotient is collapsed by
$\theta$, we obtain that the last quotient is also collapsed. Repeating the same
argument for
$\{F_\alpha, C \vee F_\alpha, S\} : \{F_\alpha, C \vee A \vee F_\alpha, S\}$,
$\{F_\alpha, C \vee F_\alpha, C_1 \vee F_\alpha, (S - A) \vee F_\alpha\} : \{F_\alpha, C \vee A \vee F_\alpha, C_1 \vee F_\alpha, S\}$
and $\{F_\alpha, (S - A) \vee F_\alpha\} : \{F_\alpha, S\}$, where
$C_1 = S - (C \vee A)$, we obtain that
$\{F_\alpha, (S - A) \vee F_\alpha\} \equiv 0$. But then if $M_1$ and $M_2$ are any
two topologies which induce the same topology on $S - A$ we conclude that
$M_1 \cup \{F_\alpha, (S - A) \vee F_\alpha\} = M_2 \cup \{F_\alpha, (S - A) \vee F_\alpha\}$
and thus $M_1 \equiv M_2$. It can now be easily seen that if for some other
finite set B we have that $\{F_\alpha, (S - B) \vee F_\alpha\} \equiv 0$ then all the
topologies which agree on $S - (A \vee B)$ are identified by $\theta$. Because of
the completeness of the homomorphism this holds for the union of any number of
such subsets. But















for the homomorphism to be non-trivial there must exist a finite set V such
that two topologies are congruent if and only if they agree on $S - V$. Note
that otherwise there would exist a join irreducible element which does not
contain a point of $LT_1(S)$ but which would be mapped into the zero element.
But this would force the homomorphism to be trivial. This completes the
proof by showing that there is a one-to-one order preserving correspondence
between the non-trivial complete homomorphisms on $LT_1(S)$ and the proper
finite subsets of S.


We shall now investigate the problem of complementation in the lattice of 
all topologies on S. 


THEOREM 3. LT(S) is complemented if S is finite. 


Proof. Assume that S is finite. We shall call a closed non-void set C of a
topology R minimal if no proper subset of C is a closed set of R. To construct
a complement $R'$ of a topology R we pick a point from each minimal set and
denote this collection of points by A. Let the union of all minimal sets be
denoted by U. If $U = S$ then let $R'$ be the topology on S which is generated
by A and the subsets of $S - A$. If $U \subset S$ then let $R'$ be the topology
generated by $A \vee (S - U)$ and the subsets of $S - A$. It can be seen that
in either case $R \cup R' = I$ and $R \cap R' = 0$.


COROLLARY 1. Let S be a finite set which consists of more than two elements.
Then R in $LT(S)$ has a unique complement if and only if $R = 0$ or $R = I$.


Proof. From the proof of Theorem 3 we see that if in R, $R \ne 0, I$, a minimal
closed set consists of more than one element then we have more than one
way to construct the set A. Thus the complements are not unique. If all the
minimal closed sets of R consist of a single element then either $R = I$ or the
union of the minimal sets $U \subset S$. In the second case we can construct one
of the complements $R'$ as it was done in the previous proof. To construct a
different complement $R''$ for R we shall proceed as follows. Assume that
$S - U$ contains at least two distinct elements. If R contains a closed set
C, $C \subset S$, such that p in C, q not in C, and p and q in $S - U$, then a
complement $R''$ can be chosen to be the topology generated by all subsets of
$S - (U \vee p)$ and $p \vee q$. If R does not contain a closed set C as
described above then we can let $R''$ be the topology generated by all subsets
of $S - U$ and $p \vee r$, where p in $S - U$ and r in U. If $S - U$ consists of
a single element p and there is a closed set C in R such that
$\{p\} \subset C \subset S$ then let $R'' = \{\phi, (S - C) \vee p, S\}$. In the
case where there is no such set C in R we let
$R'' = \{\phi, p, q \vee p, S\}$. It is seen that in all cases $R'' \ne R'$ and
$R'' \cup R = I$, $R'' \cap R = 0$. This completes the proof.
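
For a very small set, Theorem 3 and Corollary 1 can be confirmed by exhaustive search. The sketch below does not follow the construction used in the proofs; it simply enumerates every topology on a three-element set by its closed sets and counts complements by brute force. All function names are hypothetical.

```python
from itertools import combinations

def all_topologies(S):
    """All topologies on the finite set S, each given by its closed sets."""
    S = frozenset(S)
    elems = sorted(S)
    proper = [frozenset(c) for r in range(1, len(elems))
              for c in combinations(elems, r)]
    found = []
    for r in range(len(proper) + 1):
        for chosen in combinations(proper, r):
            fam = frozenset(chosen) | {frozenset(), S}
            if all((a | b) in fam and (a & b) in fam for a in fam for b in fam):
                found.append(fam)
    return found

def join(R1, R2, topologies):
    # Lattice union: the smallest topology containing both families.
    uppers = [T for T in topologies if R1 <= T and R2 <= T]
    return frozenset.intersection(*uppers)

def complements(R, topologies, zero, one):
    return [T for T in topologies
            if R & T == zero and join(R, T, topologies) == one]

S = {1, 2, 3}
tops = all_topologies(S)
zero = frozenset({frozenset(), frozenset(S)})
one = max(tops, key=len)  # the discrete topology: every subset is closed
counts = {R: len(complements(R, tops, zero, one)) for R in tops}
print(len(tops), min(counts.values()))
```

Every topology on the three-element set turns out to have at least one complement, and only the zero and unit elements have exactly one.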


The corresponding questions about complements of $LT(S)$ and $LT_1(S)$
when S is an infinite set are interesting problems and have not been answered
in this paper.













We shall now investigate the group of automorphisms of the lattice of
topologies and the lattice of $T_1$-topologies on S.

We shall say that the points p and q of a point lattice L form a union of
type n if $p \cup q$ contains n distinct points.


THEOREM 4. The group of automorphisms of $LT(S)$ is isomorphic to the
symmetric group on S if S consists of one or two elements or is infinite;
otherwise the group of automorphisms is isomorphic to the direct product of the
symmetric group on S with the two element group.


Proof. Since the lattice of topologies on S is a complete point lattice, any
automorphism is characterized by the permutation it induces on the set of
points of LT(S). If S consists of one or two elements then it can be seen that
the group of automorphisms of LT(S) is the symmetric group on S. Let us
now assume that S consists of three or more elements. We shall denote the
collection consisting of topologies of the form {∅, p, S}, p in S, by n, and
the collection consisting of topologies of the form {∅, S − p, S}, p in S, by m.
Then any element from n ∨ m forms a union of type at most three with any
other point of LT(S). On the other hand, for every point P of LT(S), P not
in n ∨ m, there is a point Q such that Q ∪ P is of type four. From this it
follows that every automorphism maps the set n ∨ m onto itself. Furthermore,
any two distinct elements from n or m form a union of type three, and
an element from n always forms a union of type two with an element from
m. This implies that every automorphism has to map either n onto n and m
onto m, or n onto m and m onto n. We shall now show that an automorphism
which maps n onto n and m onto m corresponds to a permutation of the set S,
and we know that to every permutation of S there corresponds such an
automorphism. First let us show that if {∅, a, S} → {∅, b, S} then {∅, S − a, S}
→ {∅, S − b, S}. To see this observe that the third point {∅, a ∨ x, S},
x ≠ a, x in S, contained in the union of {∅, a, S} and {∅, x, S} forms unions
of type three with {∅, S − a, S}. Thus its image {∅, a′ ∨ x′, S} has to form
unions of type three with the image of {∅, S − a, S}. Let this image be
{∅, S − a″, S}. But this can hold for all possible x in S only if a′ = a″.
Furthermore, if {∅, D, S} is not in n ∨ m then p is in D if and only if
{∅, p, S} ∪ {∅, D, S} is a union of type two. Thus p is in D if and only if
p′ is in D′, which shows that the mapping corresponds to a permutation on S.
It can be seen that the lattice operations are preserved under this mapping
and that to distinct automorphisms of this type there correspond distinct
permutations of S and vice versa. By a similar argument one can show that if
there exists an automorphism which maps n onto m and m onto n then this
automorphism corresponds to a permutation on S followed by a complementation:


{∅, D, S} → {∅, D′, S} → {∅, S − D′, S}.


Such a mapping preserves the lattice operations if S is finite. Thus for every
permutation on S we have an automorphism which maps the permuted elements
onto their complements. Thus if S is finite and contains three or more
elements the group of automorphisms is isomorphic to the direct product of
the symmetric group on S and the two-element group.
If S is infinite then there can be no automorphism which maps n onto m and m
onto n because ⋃ m < ⋃ n (the union of n is the discrete topology, while the
union of m contains only the cofinite sets) and thus the lattice operations are
not preserved. In this case the group of automorphisms of LT(S) is isomorphic
to the symmetric group on S.
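
As an illustration of Theorem 4 for very small S, the following brute-force sketch enumerates all topologies on a finite set and counts the automorphisms of LT(S). It relies only on the fact used in the proof, namely that an automorphism is determined by the permutation it induces on the points {∅, D, S}. The function names (`subsets`, `generate`, `all_topologies`, `count_automorphisms`) are illustrative choices and not part of the paper, and the search is feasible only for sets of at most three elements.

```python
from itertools import combinations, permutations

def subsets(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def generate(family, S):
    """Smallest topology on S containing every set in `family`."""
    top = set(family) | {frozenset(), S}
    changed = True
    while changed:
        changed = False
        for a in list(top):
            for b in list(top):
                for c in (a | b, a & b):
                    if c not in top:
                        top.add(c)
                        changed = True
    return frozenset(top)

def all_topologies(S):
    """Every topology on the finite set S, found by brute force."""
    S = frozenset(S)
    proper = [A for A in subsets(S) if A and A != S]
    return {generate(fam, S) for fam in subsets(proper)}

def count_automorphisms(S):
    """Count permutations of the points that extend to lattice automorphisms."""
    S = frozenset(S)
    tops = all_topologies(S)
    points = [T for T in tops if len(T) == 3]  # topologies {∅, D, S}
    count = 0
    for image in permutations(points):
        sigma = dict(zip(points, image))

        def extend(T):
            # Image of T: the union (join) of the images of the points below T.
            sets = set()
            for P in points:
                if P <= T:
                    sets |= sigma[P]
            return generate(sets, S)

        images = {T: extend(T) for T in tops}
        if set(images.values()) != tops:
            continue  # not a bijection of LT(S)
        if all((A <= B) == (images[A] <= images[B]) for A in tops for B in tops):
            count += 1
    return count
```

For a three-element set the count should equal 3! · 2, the order of the direct product of the symmetric group with the two-element group; for a two-element set it should equal 2.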


THEOREM 5. The group of automorphisms of LT_1(S) is isomorphic to the
symmetric group on S if S is infinite.


Proof. We recall that the points of L7,(S) are topologies of the form 
{ F., S — p, S}. Any automorphism on L7,(S) has to map the set of points 
onto itself. Similarly every automorphism has to map the set & of join ir- 
reducible elements of L7,(S), that is, the set consisting of topologies of the 
form {¢, D V F,, S}, D proper infinite subset of S, onto itself. But since any 
topology of L7,(S) can be written as a union of topologies of 2 we see that every 
automorphism is defined by the permutation it induces on the set %. On the 
other hand, we note that p is in D, S — D infinite, if and only if {F., D V Fu, 
S} U {F.,S — p, S} does not cover {F.,D V F,,S}. Similarly p is in D, 
S — D finite, if and only if {F.,S — p, S} is not contained in {F,,D V F,}. 
Thus every automorphism is characterized by the permutation it induces on 
the set of points of LT,(S) and every permutation of the set of points defines 
an automorphism. Thus the group of automorphisms is isomorphic to the 
symmetric group on S. 


We shall conclude by giving a definition of a lattice theoretic property which
clarifies the difference between the lattice of topologies on a finite set and an
infinite set, and which is in general useful in the study of point lattices. Let L
be a complete point lattice with the set of points P = {p, q, r, s, ...}. If
A ⊆ P then let


Ā = ∩ {B | A ⊆ B ⊆ P; p, q in B and r ≤ p ∪ q implies r in B}.


DEFINITION. Let L be a complete point lattice with the set of points P.
Then L is said to be tall if for every A ⊆ P, ⋃ A = a, we have that Ā =
{p ∈ P | p ≤ a}.

This means that if L is a tall lattice then L is completely determined if we
know the unions of pairs of points in L, and it is the largest possible lattice
which can be constructed with these given unions of pairs of points. Note
that if we consider a lattice of subspaces of a geometry (4) then Ā is the
smallest subspace which contains the set of points A. The following result can
now be obtained.


THEOREM 6. LT(S) is a tall lattice if and only if S is finite. 













REFERENCES 


1. R. W. Bagley, On the characterization of the lattice of topologies, J. Lond. Math. Soc., 30 
(1955), 247-249. 

2. R. W. Bagley and David Ellis, On the topolattice and permutation group of an infinite set, 
Math. Japon., 3 (1954), 63-70. 

3. G. Birkhoff, On the combination of topologies, Fund. Math., 26 (1936), 156-166. 

4. Juris Hartmanis, Two embedding theorems for finite lattices, Proc. Amer. Math. Soc., 7 
(1956), 571-577. 

5. Edwin Hewitt, A problem of set-theoretic topology, Duke Math. J., 10 (1943), 309-333. 

6. R. Vaidyanathaswami, Treatise on set topology, Indian Math. Soc. (Madras, 1947). 


Ohio State University 
and 
General Electric Research Laboratory 














AN EFFECTIVE EXAMPLE OF A
TRANSFINITELY NON-PROJECTIVE SET


F. ROTHBERGER 


1. Introduction.
2. Lebesgue's method (elimination of transfinite numbers).
3. Effective definition of P_α and of a non-projective set.
4. Proof.
5. Simplifications.


1. Introduction. The notion of a projective set¹ of finite class P_n can be
generalized to the transfinite classes P_α (α < Ω). We shall here effectively
construct a set that is not projective, even in this generalized sense.²

In what follows, the variables x, y, z range over the irrational numbers.
Although this is not necessary, it simplifies the definitions, since the Peano
curve establishes a homeomorphism between the set J of irrational numbers and
the set J² of the plane, or J³ (irrational points of space), etc. Hence, since
the projective classes are topological, this allows us, in a sense, to
disregard the number of dimensions of the space.

The analytic sets A are defined as projections of G_δ sets or, more
simply, as projections of relatively closed sets in J^n.
Let A = P_1. Next come the analytic complements CA, their projections
PCA = P_2, PCPCA = P_3, ....

A set that is a (countable) sum of sets P_n (n variable)
will be of class P_ω, and we define P_{ω+1} = PCP_ω, and so on, for all
α < Ω. As is known, the class A contains all the Borel sets B;
the class P_2 = PCA contains all the sets B, A, CA; and in general the
class P_α (provided it exists) will contain all the preceding classes,
that is, P_β and CP_β for every β < α.

One verifies that all these classes are topological.


Remark. Since the operation P loses one dimension, one can replace
P by Pπ, where π is the Peano transformation of the line J onto the
plane J². That is, one transforms a linear set into a plane set,
and then takes the projection. One can also replace PCP by PCPπ_3, where
π_3 is the Peano transformation in three dimensions. In this way one always
remains in the same dimension.


Received April 3, 1957.

¹For general information, one may consult the classical books (1; 3). In
particular, for Lebesgue's method, see (3, pp. 298-320).

²In principle, the possibility of effectively constructing a non-projective set has long been
known (5). From the point of view of finite projectivity, P_ω is already non-projective (4, p. 12).




Definition. A universal set for the class P_α is a set in the
space J^n, or, to fix ideas, in the plane (x, y), such that, as y
varies, the sections parallel to the x-axis run through all the
linear sets of class P_α (lower classes included); repetitions are
allowed.

We stipulate further that the universal set must itself be a
projective set (in several dimensions), though possibly of a class
higher than α.

To prove the existence of a set properly of class P_{α+1}, it
suffices to prove the existence of a universal set for the class P_α.
(For the limit classes, for example P_ω, there is no difficulty.)

Thus the same method that serves to prove that the classes P_1, P_2, ...,
P_n, ... are distinct can be used for the classes P_α; this will entail
the existence of sets properly P_α for every α < Ω.

Our aim is to construct universal projective sets P_α for
a transfinite sequence of α → Ω. More precisely, we shall construct a
set in the space (x, y, t) whose parallel sections, for variable t (0 ≤ t < 1),
run through universal sets for classes as high
as one wishes. This set will thus be, in a certain sense, universal for all
linear projective sets of all classes α < Ω, and hence itself
non-projective. The definition will be formalizable in logical and arithmetical terms,
that is, with the symbols ∨, &, →, not, (∃n), (n) (quantifiers of the
first order), (∃x), (y) (quantifiers over real variables, second order), and the
symbols of arithmetic; and in addition one quantifier of the third order, which
enters in order to avoid an implicit definition.

Here the variables ranging over the natural numbers (or, if one prefers, the
rational numbers) are regarded as variables of the first order; the variables
x, y, ..., ranging over the system of real numbers (or, what amounts to the
same, over the system of sets of natural numbers) will be
variables of the second order; finally, a variable φ(x) or φ(x, y), ..., or again
a variable ranging over the system of all sets of real numbers
(that is, sets of sets of natural numbers) will be of the third
order. (An irrational variable will be of the second order like a real
variable, etc.)

Let us note that this third order (or else the implicitness) is essential
(2): every set that is explicitly arithmetizable with the functional calculus
of the second order is at most projective of finite class.


2. Lebesgue's method for eliminating transfinite numbers.
First, let t be a real parameter, 0 ≤ t < 1, in dyadic expansion:


t = 0.a_1 a_2 a_3 ... a_n ...


and let


r_1, r_2, r_3, ..., r_n, ...


be a well-defined sequence of all the rational numbers, fixed for all that
follows.


To each t corresponds the set T of all the r_n such that a_n = 1. For
example:


[worked example: a particular dyadic expansion of t and the corresponding set T; the digits are illegible in the source]


In the case of ambiguity for certain rational t, let us agree that T is the
finite set. (The other one, not being well-ordered, will be of no interest.) For
a formal definition, see Definition 2 below.


Definition 1. We shall say that t is well-ordered of type α if T, ordered
by the relation <, is well-ordered of type α; formally: ord t = α.


Definition 2. We shall write r ∈ t instead of r ∈ T. Or, formalized:
r_n ∈ t .≡. [2^n t] ≡ 1 (mod 2).


Definition 3. Let t_1 < t_2 if T_1 is a proper segment (Abschnitt) of T_2.
Formally:


t_1 < t_2 :≡: (∃r)(r′) . r′ ∈ t_1 .≡. r ∈ t_2 & r′ ∈ t_2 & r′ < r.


(One could eliminate the ∈ by means of Definition 2.)
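
The coding of Definitions 2 and 3 can be tried out on finite examples. The sketch below assumes a hypothetical prefix `R` of the enumeration r_1, r_2, ... (the paper fixes an enumeration but does not specify it) and considers only parameters t whose binary digits beyond that prefix are zero; the helper names `member`, `decode`, `encode` and `precedes` are illustrative, not the author's.

```python
from fractions import Fraction
from math import floor

# Hypothetical prefix r_1, ..., r_6 of the fixed enumeration of the rationals.
R = [Fraction(0), Fraction(1), Fraction(1, 2),
     Fraction(2), Fraction(1, 3), Fraction(3, 2)]

def member(n, t):
    """Definition 2: r_n belongs to t iff the integer part of 2^n t is odd."""
    return floor(2**n * t) % 2 == 1

def decode(t):
    """The set T of rationals coded by the first len(R) binary digits of t."""
    return {R[n - 1] for n in range(1, len(R) + 1) if member(n, t)}

def encode(T):
    """The parameter t whose n-th binary digit is 1 exactly when r_n is in T."""
    return sum((Fraction(1, 2**n) for n in range(1, len(R) + 1)
                if R[n - 1] in T), Fraction(0))

def precedes(t1, t2):
    """Definition 3 for finite sets: T1 is the segment of T2 below some r in T2."""
    T1, T2 = decode(t1), decode(t2)
    return any(T1 == {q for q in T2 if q < r} for r in T2)
```

For instance, t = 5/8 = 0.101 (binary) codes T = {r_1, r_3}, which with this choice of `R` is {0, 1/2}, a proper segment of the set {0, 1/2, 1} coded by 7/8. For a finite T, ord t is simply the number of elements of T.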
It is easily seen that:


LEMMA.


ord t = (max ord τ) + 1, τ < t,   or   ord t = lim ord τ, τ < t


(according to the two cases).


Thus, instead of defining sets E_α by induction on α, we can
define sets E_t by an induction on t with respect to the relation <.


3. Effective definition of the P_α (α < Ω) and of a non-projective set.
Let π_3 = π_3(x, z, z′) be a homeomorphism between J and J³ (Peano curve).
We suppose it effectively determined by an arithmetized definition.
Likewise, suppose effectively given another one, in infinitely many
dimensions, which makes the irrational numbers y correspond to the sequences
(y_1, y_2, y_3, ..., y_n, ...) of irrational numbers, in a one-to-one and
bicontinuous manner. Instead of the notation y_n = π_n(y), n = 1, 2, 3, ...,
we shall write


y_n = π_{r_n}(y)


(that is, replacing the natural index by a rational index, which runs through
the sequence r_1, r_2, r_3, ...). Here again, the definitions are arithmetizable,
in the same sense.

We shall now define the sets announced in the introduction
by a propositional function φ(t, x, y) (0 ≤ t < 1, x ∈ J, y ∈ J). Here
x will be the principal variable, y and t parameters. By means of the logical
and arithmetical symbols (with quantifiers of the first and second order) and
of notions formalizable in this sense, this function φ is defined implicitly by
transfinite induction with respect to <, as follows:


(1) φ(0, x, y) .≡. (∃r) π_r(y) < x < r,

and, if t is well-ordered and t ≠ 0:

(2) φ(t, x, y) .≡. (∃z)(z′)(∃r) φ(t_r, π_3(x, z, z′), π_r(y)),


where t_r is the segment of t up to r (r ∈ t); more precisely,

r′ ∈ t_r .≡. r ∈ t & r′ ∈ t & r′ < r.


Thus, if r ∉ t, we have t_r = 0 (that is, the empty set).
Hence, supposing that φ(τ, x, y) is defined for all τ < t (t well-ordered),
formula (2) defines φ(t, x, y). Thus φ is defined for all well-ordered t.
If t is not well-ordered, one may define either


(3) φ(t, x, y) .≡. 0 = 0 (that is, always true),

or

(3.1) φ(t, x, y) .≡. 0 = 1 (that is, always false).


Note that "t is not well-ordered" can be formalized as follows:

(4) (∃τ)(r′)(∃r) . r′ ∈ τ → r′ ∈ t .&. r ∈ τ & r < r′.


Let us remark in passing that the real quantifier (∃τ) is essential:
indeed, this is the classical example of an analytic set. Better arithmetized:


(∃τ)(m)(∃n) ∴ r_n < r_m .&. [2^n τ] ≡ 1 (mod 2) :&: [2^m τ] ≡ 0 ∨ [2^m t] ≡ 1 (mod 2).


One could combine (2) and (3) into a single formula for all t ≠ 0:

(2*) φ(t, x, y) :≡: (right-hand side of (2)) .∨. (formula (4)).


I claim that the set t̂x̂ŷ φ(t, x, y) is non-projective.


4. Proof. We shall prove that these sets have the intended
properties, that is, that x̂ φ(t, x, y) runs through all the projective sets
and that x̂ŷ φ(t, x, y) are universal projective sets.

To see the meaning of (2), let


E_{t,y} = x̂ φ(t, x, y),   E³_{t,y} = x̂ẑẑ′ φ(t, π_3(x, z, z′), y),   E³_t = x̂ẑẑ′ŷ φ(t, π_3(x, z, z′), y).


We have


E_{t,y} = PCPC Σ_r E³_{t_r, y_r}


by virtue of (2), where


Σ_r E³_{t_r, y_r} = E³_{t_{r_1}, y_1} + E³_{t_{r_2}, y_2} + E³_{t_{r_3}, y_3} + ...


Here y_1, y_2, y_3, ... are independent variables, so the summands are
independent of one another; and if the E³_{t_r} are universal for certain
classes P_{β_r}, the sum will give all the sets (taken from J³) of the form


Σ P_{β_r}.


This sum Σ E³_{t_r} will therefore be universal for the class of order
max β_r, or lim β_r. The operation PCPC will increase this number by 2.

For the interpretation of (1), note the following: as the parameter y runs
through its values, the set x̂ φ(0, x, y) runs through all the sums of intervals
(y_r, r) with y_r < r, y_r ∈ J and r rational. But these intervals (y_r, r)
form a base for the open sets in J; hence x̂ φ(0, x, y) runs through all the
open sets, and x̂ŷ φ(0, x, y) is a universal set for the class G
of open sets.

Next, if ord t = 1, the set x̂ŷ φ(t, x, y) is of the form PCPC Σ E³, where
E is universal for the class G; hence x̂ŷ φ(t, x, y) will be universal for the
class PCPCG, that is, for the class P_2 (since PCPCG = PCPF = PCA
= P_2, where PF = A; see the introduction).

Likewise, if ord t = 2, the set x̂ŷ φ(t, x, y) will be universal for the class
PCPC Σ P_2 = P_4 (since the sum Σ, which corresponds to (∃r), is
countable), and so on. If ord t = ω, one finds PCPCP_ω = P_{ω+2}.

In general, if ord t = α, the set x̂ŷ φ(t, x, y) will be a universal set
for a class P_β, and one sees by transfinite induction that β → Ω as α → Ω.
(More precisely, β = 2n for α = n < ω, and β = 2α + 2 for α ≥ ω.)


We can therefore, for every effective α (that is, such that one can name
a set of rational numbers of this type, ordered by <), effectively name
a universal set of class at least P_α.

Our construction also shows (see the introduction) the existence of sets
properly P_α for every α < Ω; and as t, y run through all their values,
the set x̂ φ(t, x, y) runs through all the projective sets of all
classes (finite or transfinite). Thus the set t̂x̂ŷ φ(t, x, y) (in the space J³)
cannot be projective of any class: otherwise, if it were of class
P_{α_0}, all the linear sections x̂ φ(t, x, y) would be so too, and hence of
bounded classes α ≤ α_0. Q.E.D.*

However, this definition is, as already said, implicit (see 2, p. 248),
because of definition (2). We shall now make it explicit by means of
a quantifier of the third order. Let us denote by (1′), (2′), (2*′) the
formulas obtained from (1), (2), (2*) by replacing t, x, y respectively
by t′, x′, y′. Here is the formula:


(5) t̂x̂ŷ (∃φ)(t′)(x′)(y′) : (1′) .&. (2*′) ∨ t′ = 0 .&. φ(t, x, y)

or, what amounts to the same:

t̂x̂ŷ (φ) : (t′)(x′)(y′) . (1′) .&. (2*′) ∨ t′ = 0 :→: φ(t, x, y).


Thus the explicit definition of this set is arithmetizable in the
functional calculus of the third order, Q.E.D.


5. Simplifications. By allowing some rather slight and inessential
modifications of this set, one can greatly simplify its definition.

First, the initial definition (1) can be replaced simply by this:

(6) φ(0, x, y) .≡. x < y < x + 1.

Next, without changing the sections with t well-ordered in the set
(5), one can replace (2*) by (2) and, as we shall see, the equivalences
by implications, that is, replace (6) and (2) by


(7) ψ(0, x, y) → x < y < x + 1 and ψ(t, x, y) → Fψ(t, x, y) for t ≠ 0,


where Fψ(t, x, y) .≡. (∃z)(z′)(∃r) ψ(t_r, π_3(x, z, z′), π_r(y)).

One thus obtains another non-projective set, but an entirely analogous one,
namely:

(8) t̂x̂ŷ (∃ψ)(quant){ψ(t′, x′, y′) → : A .&. B → ψ(t″, x″, y″)} & ψ(t, x, y),
    (quant) = (t′)(x′)(y′)(∃z)(z′)(∃r)(t″)(x″)(y″),
    A .≡. t′ ≠ 0 ∨ x′ < y′ < x′ + 1,
    B .≡. t′ ≠ 0 & t″ = t′_r & x″ = π_3(x′, z, z′) & y″ = π_r(y′).

To interpret this formula, it suffices to substitute respectively t′ = 0
and t′ ≠ 0 in {...}; one finds precisely condition (7) for
ψ(t′, x′, y′).

It remains to justify the replacement of "≡" by "→". Observe that, for
given t, Fψ depends only on the t_r < t, and that F is an "increasing function"
in the sense that


ψ → φ :→: Fψ → Fφ (t ≠ 0).


*Naturally, this non-projectivity can also be proved without serious difficulty
by an adaptation of the diagonal procedure.


Now let φ be fixed, satisfying (6) and (2), hence φ ≡ Fφ, and let ψ be a
variable function satisfying


(9) ψ(0, x, y) → φ(0, x, y) and ψ → Fψ for t ≠ 0.


Let us consider only the well-ordered values of t. One finds by
transfinite induction on t that


ψ(t, x, y) → φ(t, x, y), hence (∃ψ satisfying (9)) ψ(t, x, y) → φ(t, x, y),

from which it follows without difficulty that


φ(t, x, y) .≡. (∃ψ satisfying (9)) ψ(t, x, y)


for t well-ordered. Noting that (9) is equivalent to (7), we can therefore
replace (6) and (2) by (7), which easily gives us (8).

This formula (8) contains only three occurrences of ψ; but one can
prove that a formula of this kind, with only one or two occurrences,
would always give a projective set.


REFERENCES


1. C. Kuratowski, Topologie I (Warszawa-Lwów, 1933).

2. C. Kuratowski and A. Tarski, Les opérations logiques et les ensembles projectifs, Fund.
Math., 17 (1931), 240-248.

3. N. Lusin, Leçons sur les ensembles analytiques (Paris, 1930).

4. W. Sierpinski, Les ensembles projectifs et analytiques (Mémorial des sciences
mathématiques, Fasc. 112) (Paris, 1950).

5. W. Sierpinski, Sur les ensembles de points qu'on sait définir effectivement, Verhandlungen des
internationalen Mathematiker-Kongresses Zürich, 1932, 1. Band, p. 280.


Université Laval
Québec











CONGRUENCE REPRESENTATIONS IN ALGEBRAIC 
NUMBER FIELDS II. SIMULTANEOUS LINEAR 
AND QUADRATIC CONGRUENCES 


ECKFORD COHEN 


1. Introduction. Let f and λ be positive integers and p a positive odd
prime. Suppose further that P is an ideal of norm p^f in a finite extension F
of the rational field. In (2), which will also be referred to as I in the present
paper, we obtained the number of solutions N_s(m) of the quadratic congruence,


(1.1) m ≡ α_1 x_1² + ... + α_s x_s² (mod P^λ),


where m is an arbitrary integer of F, and the α_i are integers of F prime to P.
In this paper we consider an analogous problem in simultaneous representation
involving both linear and quadratic congruences. In particular, we shall
determine the number of simultaneous solutions N_s(m, n) of the pair of
congruences


(1.2)   m ≡ α_1 x_1² + ... + α_s x_s² (mod P^λ),
        n ≡ β_1 x_1 + ... + β_s x_s (mod P^λ),


where m and n are arbitrary integers of F, and the α_i and β_i are integers of F
prime to P.

As in I we make use of the theory of exponential sums in algebraic number
fields. However, the method of the paper requires only the most elementary
properties of the generalized Cauchy-Gauss sums (§2). The function N_s(m, n)
is completely and explicitly evaluated in Theorem 1 (§3), and on the basis of
this result, precise solvability criteria for the congruences (1.2) are deduced
in §4 (Theorem 2). In contrast with the three cases of insolvability of (1.1)
obtained in I (see the Remark at the end of the present paper), there are
thirteen cases in which the simultaneous congruences (1.2) may have no
solutions (Theorem 2). Another striking difference between the results for the
two problems lies in the fact that (1.1) is always solvable for s ≥ 3, while the
minimal value of s such that the pair of congruences (1.2) is always solvable
is s = 5 (Corollary (2.1)).
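
For small parameters, N_s(m, n) can be tabulated directly in the rational case f = 1, where (1.2) becomes a pair of congruences modulo p^λ. The following is a brute-force sketch only and not the method of the paper; the function name `count_solutions` and its argument names are illustrative assumptions.

```python
from itertools import product

def count_solutions(m, n, alphas, betas, p, lam):
    """Number of x in (Z/p^lam)^s with
    sum(alpha_i * x_i^2) = m and sum(beta_i * x_i) = n (mod p^lam)."""
    q = p ** lam
    total = 0
    for x in product(range(q), repeat=len(alphas)):
        quad = sum(a * xi * xi for a, xi in zip(alphas, x)) % q
        lin = sum(b * xi for b, xi in zip(betas, x)) % q
        if quad == m % q and lin == n % q:
            total += 1
    return total
```

A simple consistency check is that the counts over all pairs (m, n) add up to p^{λs}, since every s-tuple x determines exactly one pair (m, n). The cost grows like p^{λs}, so the sketch is only suitable for spot checks of the explicit formulas.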

In the special case λ = 1, the congruences in (1.2) may be interpreted
as simultaneous equations in the Galois field GF(p^f). The problem of
determining N_s(m, n) in this case was solved in (3), with comparatively simple
results. In that paper it was shown that N_4(m, n) > 0 in case λ = 1. By
contrast, if λ > 1, two distinct cases of insolvability may occur in (1.2) when
s = 4 (see Theorem 2). In addition, there exist only four insolvable cases
of (1.2) in case λ = 1, as compared with thirteen in the case of arbitrary λ.


Received November 6, 1957.


Another important special case arises when f = 1, in which case (1.2) may
be viewed as a simultaneous pair of rational congruences (mod p^λ). This
problem was considered by O'Connor, using a quite different method, in
connection with a more general investigation (4). However, his results were
fragmentary except in the case λ = 1. The results of the present paper can
therefore be viewed as a completion and extension of some of O'Connor's
results on rational congruences.


2. Exponential sums. Let us choose an ideal C of F, not divisible by the
prime ideal P, such that θ = PC is principal. Any integer ρ of F has a
representation (mod P^λ) of the form ρ ≡ θ^t ξ where λ ≥ t ≥ 0 and (ξ, P) = 1. In this
representation t is uniquely determined and, in addition, ξ is uniquely
determined (mod P) if t ≠ λ. (If t = λ, one may assume ξ = 1.) We let D denote
the ideal different of F and choose an ideal B such that ζ = B/(P^λ D) is
principal. Further, let T(ρ) denote the trace function in F. Then we place
ζ_k = ζθ^{λ−k} (0 ≤ k ≤ λ), where ζ = ζ_λ, and define e_k(ρ) = exp(2πiT(ρζ_k)) with
e(ρ) = e_λ(ρ). This is the exponential function (mod P^λ) introduced by Hecke
(4, §54) and discussed under a different notation in I. The function e(ρ) is
an additive character (mod P^λ) and has the simple properties: e_0(ρ) = 1,
e_k(ρ) = e_k(ρ′) if ρ ≡ ρ′ (mod P^k), e_k(ρθ^j) = e_{k−j}(ρ) for k ≥ j ≥ 0, and
e_k(ρ_1 + ρ_2) = e_k(ρ_1)e_k(ρ_2). In addition, if P is of norm p^f, p prime, then


(2.1)   Σ_{x (mod P^k)} e_k(ρx) = p^{fk} if P^k | ρ,   and   = 0 if P^k ∤ ρ.

Suppose that λ ≥ k ≥ 0 and let a and b be integers of F. The notation
ψ(a) will be used to denote the Legendre symbol (a/P) in F. We now
introduce several trigonometric sums which will be needed in our discussion:


(2.2)   G_k(a) = Σ_{x (mod P^k)} e_k(ax²),   G(a) = G_1(a),

(2.3)   G*_k(a) = Σ_{(x, P) = 1} ψ(x) e_k(ax),   G*(a) = G*_1(a),

(2.4)   c_k(a) = Σ_{(x, P) = 1} e_k(ax),   c(a) = c_λ(a),

(2.5)   S_k(a, b) = Σ_{x (mod P^k)} e_k(ax² + 2bx),


the sums in (2.3) and (2.4) being extended over the x (mod P^k) prime to P.
The functions G_k(a) and G*_k(a) are the Hecke sums (4, §54), c_k(a) is
Rademacher's sum (6, §2.2), and S_k(a, b) is the generalization to F of the
Cauchy-Gauss sum (3, (1.7)).
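
These sums are easy to evaluate numerically in the rational case f = 1. The sketch below assumes that in this case the character may be taken to be e_k(ρ) = exp(2πiρ/p^k), which corresponds to one admissible choice of ζ; the function names are illustrative. It evaluates the character sum of (2.1) and the quadratic sum G_k(a) of (2.2), and can be used to observe the classical evaluation G(1)² = ψ(−1)p.

```python
import cmath

def e(k, rho, p):
    """Assumed rational-case character: e_k(rho) = exp(2*pi*i*rho / p^k)."""
    q = p ** k
    return cmath.exp(2j * cmath.pi * (rho % q) / q)

def char_sum(k, rho, p):
    """Left-hand side of (2.1): sum of e_k(rho * x) over x mod p^k."""
    return sum(e(k, rho * x, p) for x in range(p ** k))

def gauss(k, a, p):
    """G_k(a) of (2.2): sum of e_k(a * x^2) over x mod p^k."""
    return sum(e(k, a * x * x, p) for x in range(p ** k))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r
```

With p = 3, for example, `gauss(1, 1, 3)` is approximately i√3, whose square is −3 = ψ(−1)·3, and `char_sum(2, 9, 3)` is 9 while `char_sum(2, 3, 3)` vanishes up to rounding error.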

Suppose that P is odd as well as prime. The Hecke and Rademacher sums
possess the following useful properties. If k > 0 and a ≡ θ^t u (mod P^k) where
k ≥ t ≥ 0, (u, P) = 1, then


(2.6)   G*_k(a) = p^{f(k−1)} G*(u)   (t = k − 1),
                = 0                   (otherwise);

(2.7)   c_k(a) = p^{f(k−1)}(p^f − 1)   (t = k),
               = −p^{f(k−1)}            (t = k − 1),
               = 0                      (otherwise);

if k > 0, then

(2.8)   G_k(1) = p^{fj}         (k = 2j),
               = p^{fj} G(1)    (k = 2j + 1),

(2.9)   G_k(u) = ψ^k(u) G_k(1);

moreover,

(2.10)  G*(u) = G(u) = ψ(u)G(1),

(2.11)  G²(1) = ψ(−1)p^f.


For a more detailed discussion of these sums we refer to I.

For a more detailed discussion of these sums we refer to I. 
The function S_k(a, b) has the following reduction property (cf. Carlitz
(1, Lemma 3) in the rational case).


LEMMA 1. If k > 0 and a is defined as in (2.6) then for P odd

(2.12)   S_k(a, b) = p^{ft} e_{k−t}(−r²/u) G_{k−t}(u) or 0


according as b is or is not divisible by P^t, r being defined by b ≡ θ^t r (mod P^k)
in case P^t | b.


Proof. A complete residue system (mod P^k) is given by x = θ^{k−t}y + z where
y and z range over complete residue systems modulo P^t and P^{k−t} respectively.
Hence, by the elementary properties of e_k(ρ), we have


S_k(a, b) = Σ_{z (mod P^{k−t})} e_k(az² + 2bz) Σ_{y (mod P^t)} e_t(2by).


If P^t ∤ b, then S_k(a, b) = 0 by (2.1). Suppose then that P^t | b, so that b may be
written (mod P^k) in the form b = θ^t r. Then by (2.1)


S_k(a, b) = p^{ft} Σ_{z (mod P^{k−t})} e_{k−t}(uz² + 2rz)

          = p^{ft} e_{k−t}(−r²/u) Σ_{z (mod P^{k−t})} e_{k−t}(u(z + r/u)²).


But since z + r/u and z range together over complete residue systems (mod
P^{k−t}), the Lemma follows immediately.


3. Evaluation of N_s(m, n). In the remainder of the paper f, λ, and s will
denote positive integers. As in the Introduction, P will represent an odd prime
ideal of F with norm p^f, p being a rational odd prime. The letters α_i, β_i (i = 1,
..., s) will denote integers of F prime to P, while m and n will represent
arbitrary integers of F. In addition, we define
















2 2 
a= a) Qs, B= Ps + + Bs 
a as 
and write 
m = 0°M, n = 0°N, B = 08’ (mod P*) 


where a, b, d are non-negative integers ≤ λ, and M, N, β′ are integers of F
not divisible by P. We also define for b > d,


¥= Nn’9”"-¢ _ mp’ y= @"y’ (mod P), 
where A > kh > 0 and (7’, P) = 1. We place 
6 = min(8, d) = min(0, h), 


it being understood that » = 6 if h is undefined. Also we put 
m 1 (s even) Han - (d even) 
~ (2r + 1 (s odd) ~ (2D + 1 (dodd). 


Let l denote any one of the integers a, b, h, λ. Then we place


Ee). [Ld 


where [x] denotes the greatest integer ≤ x, and l_i = a_i, b_i, h_i or λ_i according
as l = a, b, h, or λ (i = 0, 1, 2). We also write


‘i= min (ay, 6), bk = min (do, bs) 


and define for integers u, v, 


my (u odd, u < v) 
winia lo (otherwise), 
, a (u even, u < v) 
ssaaealea lo (otherwise). 
If ¢ > —1, we define further 
a se+(4—-2r) _ 4 
P,(c) = » pi | aT (r ¥ 2) 
= 
c+} (r = 2), 
o ” S(o+1)(3—2r) 
Cfo) = 2, gr oe ge 
j=0 p — 1 
e es sail 
W.(c) = : pun — pat (r # 1) 
j= 
\. +1 (r = 1). 


The final formulas for N,(m, m) will be expressed in terms of the following 
functions: 










(3.1) Ai = (p! — 1) {(p’P, (ts — 1) + W((—1) 2) Pe (ts) } 
pl" {L (a, 8) + ¥((— 1)"a)L'(a, 8)} +p”, 


(3.2) A: = p'(p’ — 1)Q(t4 — 1) — pL, 8) 
+ pierre y((— 1)’aM)L'(a, 8) + 
(3.3) B, - ptt 1)0,(hy _ D eis 1) = gfCee-en-ey. (h, d) 
+ phi V((— Lary’ )L'(h, 2), 
(3.4) B, = phOHMG 29-49 (pt — 1) ¥((—1)'a)Q,(h2 — D — 1) 
+ pm" y(y’)L(h, d) int ie 1)’a)L'(h, r), 
(3.5) Bs = p'?"(p" — 1) {pW (as — D — 1) 


+ ¥((—1)"8’a)p-"W, (he — D)} 
— pO LL (h, d) + W((—1) a8’) L'(h, d)}, 


(3.6) By = pO L(y )L(h, 4) + ¥((—1)"'aeB'y’)L’ (h, d)}. 


We are now in a position to state and prove our first main result. 


THEOREM 1. The number of solutions N,(m,n) of the pair of congruences 
(1.2) is given by the following formulas: 


If d > », then 
(2-1) (7-1) 
” Ai (s = 2r) 
(3.7) N,(m, n) = o-wena! (s = 2y + 1). 
If d < n, then 
phP-Dr-0 4 | + pirr-D4D)p 
ph PDD 4 | 4 pi Mr-D+D) B 
(3.8) N,(m,n) = pro-Dar-D 4, ms pi rtr-D+D) B or 


sO—1) (27-1) sA(2r—1)+D) 
p Ar+p B,, 


according as (i) s = 2r,d = 2D, (ii)s = 2r7,d =2D+1, (iii) s = 2r + 1, 
d = 2D, or (iv) s = 2r+1,d=2D+1. 


Proof. The proof will be divided into four parts. 


Part I. Our method is based on the Fourier representation (3) of N_s(m, n)
as a function of m and n (mod P^λ). In particular


(3.9) NAmnb=p° Do > o(u,v)ex(— mu) e(— nv), 


u(mod P>) e(mod P*) 
where 


o(u,v) = yo ex(u (ari +...+ a,x?)) @,(20(Bixi +... + Byx,)) 
( 


I] Sy\(am, Bw). 













Placing u = U@—* (k = 0,1,...,A) and summing over U(mod P*), (U, P*) 
= 1, we have 


7 
(3.10) N,(mn)=~p"*>S > i. ex(— mU) e(— 2nv). 


k=0 (U,P*)=1 o(mod P*) 
( > Sy\(a,Ue™, ge) . 
i=1 


By Lemma | it follows that S,(a,U@—*, 8) = 0, unless » can be expressed 
(mod P*) in the form v = @-*y, in which case, using (2.9), 


S\(a,U@™, Bw) = p’? y*(aU)ex(— Biy’/aWU)G,(1). 


Substituting in (3.10) and summing over y(mod P*), one obtains, on the 
basis of the definitions of a, 8, and S,, 


r 
N,(m,n) = p?°? DO p™ y*(a)Gi(1) . 
k=0 
y v**(U) ex(— mU) s(- Fs - n) ‘ 
(0, P*®)=1 
If k < d, then by Lemma 1 


B _.\_ Sp” (k <b) 
(3.12) s{- = n) = lo (k > b)’ 


(3.11) 


while if k > d, we obtain by (2.9), 





( eo a e’) (e¥ 1 >a) 
(3.13) al-$,- ) }e%v ( oi) & \— gr) Gea) 
lo (6 < d), 
We now separate the k summation in (3.11) into three parts to obtain 
(3.14) Nimn)= e+ R+d, 
2 3 


where k = 0 in ¥1,d >k >1 in Yo, and k > din Ds, it being agreed that 


vacuous sums shall have the value 0. It follows immediately that 
(3.15) ag 

1 
and by (3.12) that 


8 

(3.16) Qo = p™ Do pM Py*(a)Gi(l) De w*(U)ex(— mU). 
2 k=l (U, P®)=1 

Also, by (3.13) and the definition of 7, one obtains 


0 or 


» 
gfe Ot 94 _. 6’) p> ore aB’)Gi(1 )Gy-a(1). 


=d+ 


> wuyrer?te,(22Z) , 
(U. P*)=1 8 


ll 


(3.17) > 













according as d > 7 or d < n, because the inner U sum reduces in every case 
either to c¢,(7/8’) or to G*,(y/8’), and these functions vanish when h < k — 1, 
by (2.7) and (2.6) respectively. 


Part II. With reference to (3.16) we write 


(3.18) D=-h+h, 


22 


where k = 27 in So, and k = 27+ 1 in Soe. 


Case 1(s = 2r). Applying (2.8) one obtains 
$1 
> - a X, pre, (— m), 


which becomes by (2.7) 
(3.19) 2p = ph (9! — 1) p/P, — 1) = PE LG, 8)}. 


21 


By (2.8) and (2.11), 


Ly 
= ph P-D-OY((— 1)'a) SS pM — me), 


22 j=0 


so that by (2.7) 
(3.20) p> = phO-PO-YY((— 1)'a) { (p! — 1) Pete) — pL (a, 8)}. 


Case 2(s = 2r + 1). In this case we have by (2.8) 
y > -_ _—— > pen, (— m), 
j=l 


21 


which becomes, on applying (2.7), 

(3.21) = p81 pp! — 104 — 1) — pL Ga, 8)}. 
21 

By (2.8) and (2.11) we have 


b> . 
> = pee?-y((= 1)a)G(1) d, ptt -*9G 5 41(— m), 
j= 


2 
and hence on the basis of (2.6), (2.10), and (2.11), 
(3.22) p> = phO-vGr-D+aG-2N+4((— 1) aM)L (a, 8). 
This completes the evaluation of >». 

Part 111. Referring to (3.17) in the case d < n, we place 
(3.23) — V=eLV+1d 
where k = 27 in }>3: and k = 27 + Lin doa. 

Case 1(s = 2r, d = 2D). In this case by (2.8) 
















Al 
$(2\(r—1)+D) f4(1—2r) i 
y= pre, {2), 


31 j=D+1 
and hence on the basis of (2.7) 


(3.24) p> _ "ia, maa 1)Q,(hy — 1) 


— see || d)}. 
By (2.8) and (2.11) 


Ae . 
DY = prererr--y((— 1)"ap")G(1) pro-mngs(Z), 


32 j=D 


which by (2.6), (2.10), and (2.11) gives 
(3.25) > = tie, (On 1)’ay’)L’ (h, d). 
3 
Case 2(s = 2r, d = 2D + 1). By (2.8) 
a -_ of r- D4) 4 ( B’)G(1) > prcringt (x), 
31 j=D+1 B 


so that by (2.6), (2.10), and (2.11) 
(3.26) > — ies At d). 


31 


Applying (2.8) and (2.11), we have 
} ae pf 4-140 4g (( — 1)"a) > gf ”., ” (2) : 


32 j=D+1 


which becomes by (2.7) 


(3.27) a mi gPOnste-188 4 (( 1)"«) { (p’ ois |” peceieat, © a DD mn 1) 


32 
= "ey % r)}. 
Case 3(s = 2r + 1, d = 2D). By (2.8) 


Al 
= 1)+D) ~2fr4, 
> y pa(%), 


31 j=D+1 


and hence on the basis of (2.7) 

3.28 = pf M2r-D+D) 4 SAD+DI-)—D st __ W.(h, - D—1 

(3.28) 2 =~ \p (p’ — 1)W, (hs ) 
on meee )}. 

Applying (2.8) and (2.11), one obtains 


Az 
he = "pata 1)'a6’) t or tawa(%) 
j=D 


32 
and thus by (2.7) 
(3.29) > a pf OGr-+D-1) 4 (( 1) ap’) { (p’ = 1)p??"-'W, (he a 
32 
a r=. )}. 


D) 





Case 4 (s = 2r + 1,d = 2D + 1). In this case by (2.8) 


A 
y = perry encay + pigs (2) 
j=D+1 B 


31 


which by (2.6), (2.10), and (2.11) gives 
(3.30) p i ine 4 d). 


31 


Applying (2.8) and (2.11), one obtains 


Ae ‘ 
ym ii gf Or 9. 1)’a)G(1) 2 G3,.(4) ’ 


32 
and hence by (2.6), (2.10), and (2.11), 
3.31 _ SACQQr—1) + (a+ ID (1—1)+D) — 1 r+1 -. L’ h, r : 
(3.31) =? ¥((— 1)"ap’y’)L’(h, d) 


This completes the evaluation of $\sum_3$ in case d < n.


Part IV. We now combine the results of the preceding parts. By (3.1), 
(3.15), (3.18), (3.19), and (3.20), 


(3.32) p> + » = (Orr, (s = 2r), 
and by (3.2), (3.15), (3.18), (3.21), and (3.22) 

(3.33) p> a » SS iene (s = 2r + 1). 
By (3.3), (3.23), (3.24), and (3.25) 

(3.34) > oY gene (Case 1,d < n); 
by (3.4), (3.23), (3.26), and (3.27)

(3.35) p> as gf O 1B (Case 2,d <1); 
by (3.5), (3.23), (3.28), and (3.29) 

(3.36) > an 9/OGr-0t0)B, (Case 3, d < n); 
by (3.6), (3.23), (3.30), and (3.31) 

(3.37) of" (Case 4, d < n). 


3 
We also have by (3.17) 
(3.38) p 
3 
The theorem follows on combining the following formulas: (3.14), (3.32),
(3.38) in case d > n, s even; (3.14), (3.33), (3.38) in case d > n, s odd; (3.14),
(3.32), (3.34) in case d < n, s even, d even; (3.14), (3.32), (3.35) in case
d < n, s even, d odd; (3.14), (3.33), (3.36) in case d < n, s odd, d even; and
(3.14), (3.33), (3.37) in case d < n, s odd, d odd.


Il 
cr) 


(d > 7). 
















Although the formulas for $N_s(m, n)$ are quite complicated in the general
case, they simplify remarkably in certain special cases. For example, by taking
$\lambda = 1$ in Theorem 1, one obtains the compact results already proved in (3).
We also note the following simple corollaries which result easily from the
theorem.


Coro.iary 1.1. Jf b = 0,d > 0 (P4¢n, P\8), then 
(3.39) N,(m,n) = p?“*”. 
Coro.iary 1.2. Jf d=h=0(P%8, P47), then 


: : § gf S-0-0(gf-0 + ¥((- 1)’ay’)) or 
4 N = (A(Q2r—1)—1r) rT r , 
_— olen, #) = Y procer-v—r (pir _ y((— 1) 'af’)), 


according as s = 2r or 2r + 1. 


COROLLARY 1.3. If a = 0, b > d > h, then


, 


. ae al v((— 1)’x)) or 
(3.41) N,(m,n) = Proven ym + ¥((— 1)’am)), 


according as s = 2r or 2r + 1. 
COROLLARY 1.4. If a = 0, d > b > 0, then $N_s(m, n)$ is given by (3.41).


4. Solvability criteria. Theorem 1 presents a means for determining directly
all cases for which (2.1) is insolvable. To accomplish this, one must first
simplify the formulas for $N_s(m, n)$ for small values of s (s < 5). In obtaining these
simplifications, it is useful to observe that $\psi(-\alpha) = 1$ in case s = 2, d > 0,
and that the condition d < n always implies that d < a. However, we omit
the simplified formulas, since they involve numerous subcases and, moreover,
are of little interest beyond the verification of our second main result, which
we now state.


THEOREM 2. The function $N_s(m, n)$ vanishes (that is, (1.2) is insolvable)
if and only if one of the following sets of conditions is satisfied:


(1) s = 1, h < $\lambda$
(2) s = 2, d > n, a < $\lambda$
(3) s = 2, d < n, d even, h even, h < $\lambda$, $\psi(-\alpha\gamma') = -1$
(4) s = 2, d < n, d even, h odd, h < $\lambda$
(5) s = 2, d < n, d odd, h even, h < $\lambda$
(6) s = 2, d < n, d odd, h odd, h < $\lambda$, $\psi(\gamma') = -1$
(7) s = 3, d > n, a even, a < $\lambda$, $\psi(-\alpha M) = -1$
(8) s = 3, d > n, a odd, a < $\lambda$
(9) s = 3, d < n, d even, h odd, h < $\lambda$, $\psi(-\alpha\beta') = -1$
(10) s = 3, d < n, d odd, h odd, h < $\lambda$, $\psi(\gamma') = -1$
(11) s = 3, d < n, d odd, h even, h < $\lambda$, $\psi(\alpha\beta'\gamma') = -1$
(12) s = 4, d > n, a odd, a < $\lambda$, $\psi(\alpha) = -1$
(13) s = 4, d < n, d odd, h odd, h < $\lambda$, $\psi(\alpha) = \psi(\gamma') = -1$.







On the basis of Theorem 2 one obtains 


COROLLARY 2.1. The minimal value of s such that $N_s(m, n) > 0$ for all odd,
prime-power ideals $P^\lambda$, for all coefficients $\alpha_i$, $\beta_i$ prime to P, and for arbitrary m,
n is given by s = 5.


It will be observed that conditions (12) and (13), under which $N_4(m, n) = 0$,
fail to arise in case $\lambda = 1$. We therefore have a result proved previously (3,
§3).

COROLLARY 2.2. If $\lambda = 1$, then $N_4(m, n) > 0$.

We also note 


COROLLARY 2.3. If $\psi(\alpha) = 1$, then $N_4(m, n) > 0$.


Finally, it will be observed that the only cases of insolvability which can
occur when $\lambda = 1$ arise from cases (1), (2), (3), and (7) of Theorem 2 (3,
Theorem 2).


Remark. By I, in contrast with Theorem 2, the congruence (1.1) is insolvable
($N_s(m) = 0$), if and only if one of these three sets of conditions is satisfied:
(1) s = 1, a < $\lambda$, a odd; (2) s = 1, a < $\lambda$, a even, $\psi(\alpha M) = -1$; (3) s = 2,
a < $\lambda$, a odd, $\psi(-\alpha) = -1$. This result is not stated explicitly in I but follows
immediately from (2; (8.4), (8.5), (8.8)).


REFERENCES 


1. Leonard Carlitz, Weighted quadratic partitions (mod $p^r$), Math. Z., 59 (1953), 40-46.
2. Eckford Cohen, Congruence representations in algebraic number fields, Trans. Amer.
Math. Soc., 75 (1953), 444-470.
3. Eckford Cohen, Simultaneous pairs of linear and quadratic equations in a Galois field,
Can. J. Math., 9 (1957), 74-78.
4. Erich Hecke, Vorlesungen ueber die Theorie der algebraischen Zahlen (Leipzig, 1923).
5. R. E. O'Connor, Quadratic and linear congruence, Bull. Amer. Math. Soc., 45 (1939),
792-798.


University of Tennessee 








WELL DISTRIBUTED SEQUENCES 
F. R. KEOGH, B. LAWTON, AND G. M. PETERSEN


1. Introduction. In this note we discuss some properties of well distributed
sequences. We take $0 \le a < b \le 1$ and let $I(x)$ denote the characteristic
function of the interval $[a, b]$, so that

$$I(x) = \begin{cases} 1 & \text{if } x \in [a, b], \\ 0 & \text{otherwise.} \end{cases}$$


For convenience, we suppose that our sequences $(s_n)$ satisfy $0 \le s_n \le 1$ for
every positive integer $n$. A sequence $(s_n)$ is said to be well distributed if

$$\lim_{p \to \infty} \frac{1}{p} \sum_{k=n+1}^{n+p} I(s_k) = b - a \tag{1}$$

holds uniformly in $n$, for every interval $[a, b]$. This may be regarded as a more
stringent test of the regularity of distribution of a sequence $(s_n)$ than the
classical uniform distribution condition, where

$$\lim_{p \to \infty} \frac{1}{p} \sum_{k=1}^{p} I(s_k) = b - a \tag{2}$$
for every $[a, b]$. By a well-known theorem of Weyl (1), the condition (2)
may be expressed alternatively as

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} e(h s_k) = 0, \qquad h = 1, 2, \ldots,$$

where $e(t)$ denotes $e^{2\pi i t}$. A similar condition for well distributed sequences
has been given by Petersen (4). Thus, $(s_n)$ is well distributed if, and only if,

$$\lim_{p \to \infty} \frac{1}{p} \sum_{k=n+1}^{n+p} e(h s_k) = 0 \qquad (h = 1, 2, \ldots) \tag{3}$$

uniformly in $n$, and this is the basis for our proof of Theorem 1. Throughout,
we shall use $\{\theta\}$ to denote $\theta - [\theta]$, where $[\theta]$ is the largest integer $\le \theta$.
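As a numerical illustration of criterion (3), the following sketch evaluates the block averages $\frac{1}{p}\left|\sum_{k=n+1}^{n+p} e(h s_k)\right|$ for the sequence $s_k = \{k\theta\}$ with $\theta = \sqrt{2}$ and takes the worst case over a range of starting points $n$. The function names, the choice of $\theta$ and the ranges of $n$ and $p$ are assumptions made for the illustration only; a finite computation can suggest uniform decay but of course proves nothing.

```python
import cmath
import math


def e(t):
    """e(t) = exp(2*pi*i*t), the notation of the text."""
    return cmath.exp(2j * math.pi * t)


def frac(x):
    """Fractional part {x} = x - [x]."""
    return x - math.floor(x)


def block_average(seq, h, n, p):
    """|(1/p) * sum_{k=n+1}^{n+p} e(h * s_k)| for a sequence given as k -> s_k."""
    return abs(sum(e(h * seq(k)) for k in range(n + 1, n + p + 1))) / p


def worst_block_average(seq, h, p, n_values):
    """Largest block average over the listed starting points n."""
    return max(block_average(seq, h, n, p) for n in n_values)


theta = math.sqrt(2)  # illustrative irrational, not taken from the paper


def s(k):
    return frac(k * theta)


for p in (10, 100, 1000):
    print(p, worst_block_average(s, 1, p, range(0, 200)))
```

For this particular sequence the sum is geometric, so its modulus is at most $1/|\sin \pi h\theta|$ whatever $n$ is; that is why the printed worst case shrinks like $1/p$.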


2. THEOREM 1. If $(s_k)$ is well distributed and $s_k - t_k \to 0$ as $k \to \infty$, then
$(t_k)$ is well distributed.


(With routine changes, the word "well" may be replaced both times by
"uniformly".)


Received February 28, 1958.

Proof. We will suppose that

$$\frac{1}{p} \sum_{k=n+1}^{n+p} e(h s_k) \to 0, \qquad h = 1, 2, \ldots, \tag{4}$$

uniformly in $n$, as $p \to \infty$, and that $s_k - t_k \to 0$ as $k \to \infty$. Then, by (3), it
suffices to prove that

$$\frac{1}{p} \sum_{k=n+1}^{n+p} e(h t_k) \to 0, \qquad h = 1, 2, \ldots, \tag{5}$$

uniformly in $n$, as $p \to \infty$.
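Let $\epsilon > 0$. By our supposition that $s_k - t_k \to 0$ as $k \to \infty$, there is an
$m_0 > 0$ such that

$$|e(h(t_m - s_m)) - 1| < \epsilon \quad \text{for all } m > m_0. \tag{6}$$

Here, $m_0$ may depend on $h$ but is independent of $n$. Also, by our hypothesis
concerning (4), there is a $p_0$ independent of $n$ such that

$$\left| \frac{1}{p} \sum_{k=n+1}^{n+p} e(h s_k) \right| < \epsilon \quad \text{for all } p > p_0, \qquad h = 1, 2, \ldots. \tag{7}$$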


We apply these inequalities to the following identity:

$$\frac{1}{p} \sum_{k=n+1}^{n+p} e(h t_k) = \frac{1}{p} \sum_{k=n+1}^{n+p} e(h s_k) + \frac{1}{p} \sum_{k=n+1}^{n+p} e(h s_k)\bigl(e(h(t_k - s_k)) - 1\bigr) \tag{8}$$

and estimate the absolute values of the sums on the right. For the first, we
simply use (7). For the second, it is convenient to consider two cases according
as $n \ge m_0$ or $n < m_0$. If $n \ge m_0$, we use (6) and obtain the trivial estimate
$p^{-1}(p\epsilon) = \epsilon$, valid for all integers $p \ge 1$. But if $n < m_0$, we express it in two
parts:

$$\frac{1}{p} \sum_{k=n+1}^{n+p} = \frac{1}{p} \sum_{k=n+1}^{m_0} + \frac{1}{p} \sum_{k=m_0+1}^{n+p}.$$

Then, by applying (6) to the second term on the right, we get

$$\left| \frac{1}{p} \sum_{k=n+1}^{n+p} e(h s_k)\bigl(e(h(t_k - s_k)) - 1\bigr) \right| \le \frac{2 m_0}{p} + \frac{1}{p}(p\epsilon), \tag{9}$$

since the summand is at most 2 in absolute value. Thus for $p > p_0' = 2 m_0 \epsilon^{-1}$,
the terms on the right of (9) cannot exceed $2\epsilon$. Combining the two cases we
see that, for all $p > \max(p_0, p_0')$,

$$\left| \frac{1}{p} \sum_{k=n+1}^{n+p} e(h t_k) \right| < \epsilon + 2\epsilon = 3\epsilon,$$

by (8). This completes the proof.
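The decomposition used in the proof can be checked numerically. The sketch below takes $s_k = \{k\sqrt{2}\}$ and the perturbed sequence $t_k = \{k\sqrt{2} + 1/k\}$ (both are illustrative assumptions, as are the function names), verifies that the block average for $(t_k)$ splits into the block average for $(s_k)$ plus the correction term, and confirms that the correction term is small because $|e(x) - 1| \le 2\pi|x|$.

```python
import cmath
import math


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def frac(x):
    return x - math.floor(x)


THETA = math.sqrt(2)  # illustrative irrational
H = 1                 # frequency h in the criterion


def s(k):
    return frac(k * THETA)


def t(k):
    # Perturbed sequence: t_k - s_k differs from 1/k by an integer, so the difference tends to 0 mod 1.
    return frac(k * THETA + 1.0 / k)


def average(seq, n, p):
    """(1/p) * sum_{k=n+1}^{n+p} e(H * seq(k)), a complex number."""
    return sum(e(H * seq(k)) for k in range(n + 1, n + p + 1)) / p


def correction(n, p):
    """(1/p) * sum_{k=n+1}^{n+p} e(H s_k) * (e(H (t_k - s_k)) - 1)."""
    return sum(e(H * s(k)) * (e(H * (t(k) - s(k))) - 1)
               for k in range(n + 1, n + p + 1)) / p


def correction_bound(n, p):
    """Upper bound (2*pi*H/p) * sum 1/k for |correction(n, p)|."""
    return 2 * math.pi * H * sum(1.0 / k for k in range(n + 1, n + p + 1)) / p


for n, p in [(0, 100), (50, 1000), (500, 5000)]:
    print(n, p, abs(average(t, n, p)), abs(average(s, n, p)), abs(correction(n, p)))
```

The bound used here is cruder than the case split of the proof, but it shows the same mechanism: the correction term is controlled by how fast $s_k - t_k$ tends to zero, independently of the starting point $n$.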
3. THEOREM 2. If $(s_n)$ is a countable everywhere dense sequence in the interval
(0, 1), then $(s_n)$ can be enumerated in such a way as to be well distributed.


Proof. It is known that $\{k\theta\}$, where $\theta$ is irrational, is well distributed. Since
$(s_n)$ is everywhere dense, we can select a subsequence $(s_k')$ so that

$$|s_k' - \{k\theta\}| < \frac{1}{k}.$$

The terms of $(s_k')$ may exhaust those of $(s_n)$, in which case our statement
follows from the previous theorem. But if this is not the case, we omit the
terms $s'_{\nu^2}$ and this gives us a countable set of spaces to fill anew and we fill
them with the set made up from those $s_i$ not used and the $s'_{\nu^2}$ omitted. This
change will not affect any interval since if $r_k = 0$ for $k \ne \nu^2$ and $r_{\nu^2} = 1$, then

$$\lim_{p \to \infty} \frac{1}{p} \sum_{k=n+1}^{n+p} r_k = 0$$

uniformly in $n$ (see Lorentz (2)). Hence in either case we have a well
distributed sequence.


THEOREM 3. For every irrational $\theta$, there exists a sequence $(n_k)$ such that

$$\frac{n_k}{n_{k-1}} > \lambda > 1$$

and $\{n_k\theta\}$ is well distributed.


Proof. Since $\{n\theta\}$ is well distributed and everywhere dense we can choose
a sequence $(n_k)$ with $n_k/n_{k-1} > \lambda > 1$ such that $|\{n_k\theta\} - \{k\theta\}| < 1/k$.
Theorem 1 then gives the result immediately.


4. Any real number $\theta$ can be represented uniquely in the form

$$\theta = c_0 + \sum_{i=1}^{\infty} \frac{c_i}{a_1 a_2 \cdots a_i},$$

where $a_i$ and $c_i$ are integers with $a_i > 1$ for all $i$ and $0 \le c_i \le a_i - 1$, see (3).


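THEOREM 4. If $a_i \ge a_{i-1} + 1$ for every $i$,

$$\left\{ \left( \prod_{i=1}^{k} a_i \right) \theta \right\}, \qquad k = 1, 2, \ldots,$$

is well distributed if, and only if,

$$\left( \frac{c_k}{a_k} \right)$$

is well distributed.

Proof. We have

$$\left\{ \left( \prod_{i=1}^{k} a_i \right) \theta \right\} = \frac{c_{k+1}}{a_{k+1}} + \frac{c_{k+2}}{a_{k+1} a_{k+2}} + \cdots = \frac{c_{k+1}}{a_{k+1}} + R_{k+1}.$$

Since

$$\lim_{k \to \infty} R_{k+1} = 0,$$

our statement follows from Theorem 1.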


For every such sequence $(a_i)$, we can evidently construct a $\theta$ such that

$$\left\{ \left( \prod_{i=1}^{k} a_i \right) \theta \right\}$$

is well distributed by choosing $c_k$ so that

$$\lim_{k \to \infty} \left| \frac{c_k}{a_k} - \{k\sqrt{2}\} \right| = 0.$$

Similar remarks apply to uniform distribution. We have $\{n!\,\theta\}$ uniformly
distributed for almost all $\theta$ (5, Satz 21). Hence, if

$$\theta = \sum_{n=1}^{\infty} \frac{c_n}{n!},$$

$(c_n/n)$ is uniformly distributed for almost all $\theta$.

In the special case when $a_i = r$ for all $i$, we have our numbers expressed
to the base $r$. For any $r$, a number $\theta$ is said to be a normal number if and
only if the sequence $\{\theta\}, \{r\theta\}, \{r^2\theta\}, \ldots$, is uniformly distributed. By a
theorem of Hardy-Littlewood (1, Ch. IX, §28) it is known that almost all
$\theta$ are normal. For a result in the opposite direction, we have


THEOREM 5. If $p$, $q$ are positive integers, the sequence

$$\left\{ \left( \frac{p}{q} \right)^{k} \theta \right\}, \qquad k = 1, 2, \ldots,$$

is not well distributed for any $\theta$.


Proof. We may suppose that the sequence is uniformly distributed since
otherwise, there is nothing to prove. Then, given $N$ however large, we can
find an $m = m(N)$ such that


J p” \ T 
eS <a 
Then 
m+N k N F kum N = m \ 
> Ao" %0) -2 Ao b & ) - 2 Ao" ot ol) 
k=m+1 k=1 q q k=l q 
where 
ew-ep” | _ apg" _ x 
0< pq lon f < ap <q for allk < N. 
Hence 
| m+N k N - 
ps co" A s)| > > cos— = N/V2, 
k=m+1 q k=1 4 





and the result follows from our criterion (3). 
















REFERENCES 


1. J. F. Koksma, Diophantische Approximationen, Ergebnisse der Mathematik und ihrer
Grenzgebiete, Vol. IV (Berlin, 1936).
2. G. G. Lorentz, A contribution to the theory of divergent sequences, Acta Math., 80 (1948),
167-190.
3. I. Niven, Irrational numbers, Carus Monographs, no. 11 (1956), 164 pp.
4. G. M. Petersen, Almost convergence and uniformly distributed sequences, Quart. J. Math.
(Oxford), 7 (1956), 188-191.
5. H. Weyl, Ueber die Gleichverteilung von Zahlen mod. Eins, Math. Ann., 77 (1916), 313-352.


University College of Swansea 
University of London
University of New Mexico 













FURTHER IDENTITIES AND CONGRUENCES FOR THE 
COEFFICIENTS OF MODULAR FORMS 


MORRIS NEWMAN 


I. If $n$ is a non-negative integer, define $p_r(n)$ by

$$\sum p_r(n) x^n = \prod (1 - x^n)^r;$$

otherwise define $p_r(n)$ as 0. (Here and in what follows all sums will be extended
from 0 to $\infty$ and all products from 1 to $\infty$ unless otherwise stated.) $p_r(n)$ is
thus generated by the powers of $x^{-1/24}\eta(\tau)$, where

$$\eta(\tau) = \exp(\pi i \tau / 12) \prod (1 - x^n), \qquad x = \exp 2\pi i \tau,$$

is the Dedekind modular form. In (1) it was shown that recurrence formulas
for these coefficients depending on a parameter $p$, $p$ a prime, exist for all
positive integral $r$. The number of terms in these recurrence formulas is in
general a function of $r$ and $p$, which is determined in (1). If $r$ is even, $0 < r \le 26$,
it was shown in (2), (3) that three term recurrence formulas exist for
these coefficients for $p$ satisfying appropriate congruence conditions with
respect to 24 as modulus. These include, for example, Mordell's identity for
$\tau(n) = p_{24}(n - 1)$:

$$\tau(np) = \tau(n)\tau(p) - p^{11}\tau(n/p).$$
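The coefficients $p_r(n)$ are easy to tabulate by machine. The sketch below is not the author's method; it uses the standard recurrence $n\,p_r(n) = -r \sum_{k=1}^{n} \sigma(k)\, p_r(n-k)$, obtained by logarithmic differentiation of $\prod (1 - x^k)^r$, where $\sigma(k)$ is the sum of the divisors of $k$. The function names are hypothetical. It then checks Mordell's identity for $\tau(n) = p_{24}(n-1)$ on a small range.

```python
def divisor_sums(limit):
    """Return [sigma(0), ..., sigma(limit)] with sigma(0) = 0 as a placeholder."""
    sig = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            sig[multiple] += d
    return sig


def eta_power_coeffs(r, limit):
    """Return [p_r(0), ..., p_r(limit)], the coefficients of prod_{k>=1} (1 - x^k)^r."""
    sig = divisor_sums(limit)
    coeffs = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = sum(sig[k] * coeffs[n - k] for k in range(1, n + 1))
        coeffs[n] = (-r * total) // n  # the division is exact
    return coeffs


p24 = eta_power_coeffs(24, 120)


def tau(n):
    """Ramanujan's function tau(n) = p_24(n - 1) for n >= 1."""
    return p24[n - 1]


def mordell_holds(n, p):
    """Check tau(np) = tau(n) tau(p) - p^11 tau(n/p), with tau(n/p) = 0 if p does not divide n."""
    lower = tau(n // p) if n % p == 0 else 0
    return tau(n * p) == tau(n) * tau(p) - p ** 11 * lower


print([tau(n) for n in range(1, 7)])
print(all(mordell_holds(n, p) for p in (2, 3, 5) for n in range(1, 21)))
```

The same routine gives the partition function as `eta_power_coeffs(-1, limit)`, since $\sum p(n) x^n = \prod (1 - x^n)^{-1}$.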


$p_r(n)$ bears some relation to the function $q_r(n)$, the number of representations
of $n$ as a sum of $r$ squares. If

$$n = \sum_{k=1}^{r} \tfrac{1}{2}(3x_k^2 + x_k)$$

is a representation of $n$ as a sum of $r$ pentagons, then $p_r(n)$ is the excess of
the number of those representations of $n$ in which

$$\sum_{k=1}^{r} x_k$$

is even over those in which it is odd. Since the associated modular form is of
fractional dimension when $r$ is odd and of integral dimension when $r$ is even,
identities for odd $r$ lie deeper than identities for even $r$; and indeed quadratic
reciprocity symbols appear. A good example is furnished by the identity

$$q_3(np^2) = \left\{ p + 1 - \left( \frac{-n}{p} \right) \right\} q_3(n) - p\, q_3\!\left( \frac{n}{p^2} \right) \tag{1}$$

given by G. Pall in (7).
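Pall's identity can be tested by brute force. The sketch below assumes that identity (1) concerns sums of three squares and reads $q_3(np^2) = \{p + 1 - (\frac{-n}{p})\}\, q_3(n) - p\, q_3(n/p^2)$, with $q_3(n/p^2) = 0$ when $p^2$ does not divide $n$; this reading and all function names are assumptions of the illustration, not statements taken from (7).

```python
from math import isqrt


def q3(n):
    """Number of integer triples (x, y, z) with x^2 + y^2 + z^2 = n."""
    if n < 0:
        return 0
    bound = isqrt(n)
    count = 0
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            rest = n - x * x - y * y
            if rest < 0:
                continue
            z = isqrt(rest)
            if z * z == rest:
                count += 1 if z == 0 else 2  # z and -z
    return count


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def pall_right_side(n, p):
    """Right-hand side of the assumed form of identity (1)."""
    lower = q3(n // (p * p)) if n % (p * p) == 0 else 0
    return (p + 1 - legendre(-n, p)) * q3(n) - p * lower


for p in (3, 5, 7):
    print(p, all(q3(n * p * p) == pall_right_side(n, p) for n in range(1, 31)))
```

For example, with $p = 3$ and $n = 1$ both sides equal 30: the thirty representations of 9 are the six permutations of $(\pm 3, 0, 0)$ and the twenty-four of $(\pm 1, \pm 2, \pm 2)$.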
Received January 9, 1958. The preparation of this paper was supported (in part) by the 
Office of Naval Research. 













In this paper we study the coefficients $p_r(n)$ for $r$ odd, $0 < r < 24$. We shall
demonstrate the existence of identities of type (1) for all primes $p > 3$, and
for $p = 3$ when $r$ is a multiple of 3. Most of the discussion that follows
depends upon (1), and we assume familiarity with the contents of this paper.

After this paper was written the author received from J. H. van Lint a
copy of his dissertation, "Hecke Operators and Euler Products" (October
1957, University of Utrecht), which contains a proof of formulas (5) and (11)
of the next section. (There are minor inaccuracies in van Lint's expression for
formula (5).) van Lint's proof is based upon properties of modular forms
while the author's is based upon properties of modular functions. The methods
are quite different and yield different results in general.


II. Let $p$ be a prime. If $g(\tau)$ is a function on $\Gamma_0(p)$, we say that $g(\tau)$ is
entire if it is regular in the interior of the upper $\tau$ half-plane and has polar
singularities at most in appropriate uniformizing variables at the two parabolic
vertices $\tau = 0, i\infty$ of the fundamental region of $\Gamma_0(p)$. We require the following
lemma:


LEMMA 1. If $g(\tau)$ is a function on $\Gamma_0(p)$, then so is $g(-1/p\tau)$. If in addition
$g(\tau)$ is entire, then so is $g(-1/p\tau)$.


Proof. The second statement is clear, since the substitution $\tau' = -1/p\tau$
permutes the parabolic points $\tau = 0, i\infty$ and takes interior points of the upper
$\tau$ half-plane into interior points of the upper $\tau$ half-plane. To prove the first,
let

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

belong to $\Gamma_0(p)$, and let

$$T_p = \begin{pmatrix} 0 & -1 \\ p & 0 \end{pmatrix}$$

be the matrix of the transformation $\tau' = -1/p\tau$. Then

$$T_p M T_p^{-1} = \begin{pmatrix} d & -c/p \\ -bp & a \end{pmatrix} = M_0,$$

where $M_0$ also belongs to $\Gamma_0(p)$.

Suppose now that $g(\tau)$ is a function on $\Gamma_0(p)$, and put $f(\tau) = g(-1/p\tau)
= g(T_p\tau)$. Then $f(M\tau) = g(T_p M\tau) = g(M_0 T_p\tau) = g(T_p\tau) = f(\tau)$, so that $f(\tau)$
is also a function on $\Gamma_0(p)$. The lemma is therefore proved.

As in (1) we write $T_p g(\tau) = g(T_p\tau)$.

Following the notation of (1), let $p$ be a prime > 3, and $Q$ a power of $p$.
Define


+4 Q a square 
¢= ° 
1 otherwise, 


and set 










h(r) = LQ") | 


n(er) 


l 0 
= BA -' 


Then if $r$ is an integer, it is shown in (1) that the function

$$F(r, p, Q; \tau) = \sum_{n=0}^{Q-1} h^r(R_n\tau)$$

is an entire modular function on $\Gamma_0(p)$. Define

$$G(r, p, Q; \tau) = T_p F(r, p, Q; \tau).$$

By Lemma 1, $G(r, p, Q; \tau)$ is also an entire modular function on $\Gamma_0(p)$. It is
shown in (1) that


ees = 2. 
av. n.aie) = (8) v(t) 2 (49H). 


We write $n{:}Q$ in a summation to indicate that $n$ runs over a reduced set of
residues mod $Q$. We shall prove the following lemma:


LEMMA 2. Suppose that $Q$ is a square, and put $Q' = Q/p$. Then

$$F(r, p, Q; \tau) + G(r, p, Q'; \tau) = F(r, p, Q'; p\tau) + G(r, p, Q; p\tau).$$
Proof. Put 


fn = h’ (Rar) - 4 1eeORer) 


n(PR,r ) 
Then

$$F(r, p, Q; \tau) = \sum_{n=0}^{Q-1} g_n = \sum_{n:Q} g_n + \sum_{n=0}^{Q'-1} g_{np}.$$
Now 


m(PORn»t) _ a(PO'R,pr) 


1(PRapt) n(Rapr) * 
which implies that 


$$F(r, p, Q; \tau) = \sum_{n:Q} g_n + F(r, p, Q'; p\tau).$$


Thus we need only consider $\sum_{n:Q} g_n$. This sum is treated in (1), where it is
shown by means of the transformation formula for the Dedekind $\eta$-function
that


Le = Ow (r) © a(t) 
n-Q n-Q 


Transforming this sum by means of the identity

$$\sum_{n:Q} f(n) = \sum_{n=0}^{Q-1} f(n) - \sum_{n=0}^{Q'-1} f(np)$$

we find easily that

$$\sum_{n:Q} g_n = G(r, p, Q; p\tau) - G(r, p, Q'; \tau).$$

The lemma is thus proved.


The functions so defined are also entire modular functions on $\Gamma_0(p)$ when
$p = 3$, if $r$ is a multiple of 3. We assume from now on that $r$ is odd, $0 < r < 24$;
and that $p$ is a prime such that $p > 3$ when $(r, 3) = 1$ and $p > 2$ when $3 \mid r$.


We put 
7] 
ee ed eee 
and define 
_ Jl p=1 (mod 4) 
eli p = 3 (mod 4) * 


LEMMA 3. The function

$$f = F(r, p, p^2; \tau) + G(r, p, p; \tau)$$

is constant.
Proof. From (3), formula (2.5.2) and (1), page 354 we have 
(2) Fr, p, p37) =x" T] Q-2")-x*) "+ 


ap’? I] (1 Po x")~" > ("=*)p.ome" 


where a = a, exp { —iar(p — 1)/4}, and 


(75*) 


is the Legendre-Jacobi symbol of quadratic reciprocity; and 


(3) G(r, p, p;t) = px T] (— x”) dD p,(mp + 8)x". 


Similarly, from (1, p. 354) we have (since rv < p?) 


(4) Gr, p,e°37) = 2p" T] (Q— 2)" Dd py (np* + rv)". 
(We take this opportunity to correct an error in the second displayed formula 
for T,F on page 354 of (1). The coefficient should be Q(pQ/e)-"” instead of 
P(pQ/e)-"”.) 
From Lemma 2 with $Q = p^2$ we have that

$$f = F(r, p, p; p\tau) + G(r, p, p^2; p\tau),$$

which is regular at $\tau = i\infty$ by formulas (2) and (4). In addition,

$$T_p f = F(r, p, p; \tau) + G(r, p, p^2; \tau)$$

so that (2) and (4) imply that $f$ is regular at $\tau = 0$ as well. Since $f$ is an entire
modular function on $\Gamma_0(p)$, this implies that $f$ is constant, proving the lemma.


If we consider the expansion of $T_p f$ in powers of $x$ as in (1) we obtain our
principal result, by comparing coefficients of like powers of $x$:


THEOREM 1. For all integral $n$,

$$p_r(np^2 + r\nu) - \gamma_n\, p_r(n) + p^{r-2}\, p_r\!\left( \frac{n - r\nu}{p^2} \right) = 0, \tag{5}$$


where 


Yn =C— (7=#)pron, and c= p,(rv) + (2) 


If in this identity m is replaced by mp + 6 = np + rv — pu, 


(") 


vanishes since p|\rvy — m and we obtain 
COROLLARY 1. Put A = p*5 + rv. Then for all integral n, 


(6) p-(mp* + A) — ep,(mp + 8) + pr(*=#) i 


This identity is equivalent to the statement that the functions 1, F(r, p, p; r), 


F(r, p, p*; r) are linearly dependent. Another expression for c, obtained by 
choosing m = 0 in (6), is 


We also have 
COROLLARY 2. If $n - r\nu$ is not divisible by $p^2$ then
$p_r(np^2 + r\nu) = \gamma_n\, p_r(n)$.


We go on now to some applications of Theorem 1. Suppose that $r \ge 5$.
Then $\gamma_n \equiv c \equiv p_r(r\nu) \pmod p$, so that

$$p_r(np^2 + r\nu) \equiv p_r(r\nu)\, p_r(n) \pmod p, \qquad r \ge 5. \tag{7}$$

We choose $r = 11$, $p = 13$ in (7) as a significant example. Then from (4),
$p_r(r\nu) = p_{11}(77) \equiv 6 \pmod{13}$, so that

$$p_{11}(13^2 n + 77) \equiv 6\, p_{11}(n) \pmod{13}. \tag{8}$$

It is known (5; 8) that

$$p(13n + 6) \equiv 11\, p_{11}(n) \pmod{13}. \tag{9}$$

Combining (8) and (9), we obtain the following congruence for the partition
function mod 13, already given in (5):


COROLLARY 3. If $n \equiv 6 \pmod{13}$, then
$p(13^2 n - 7) \equiv 6\, p(n) \pmod{13}$.

We can also obtain a general congruence mod p from (7), similar to those 
given in (5; 6). 

THEOREM 2. Suppose that $r \ge 5$. Let $q$ be an arbitrary integer, and set $R =
qp^2 + r$. Then for all integral $n$,

$$p_R(np^2 + r\nu) \equiv p_r(r\nu)\, p_{q+r}(n) \pmod p. \tag{10}$$

Proof. We have

$$\sum p_R(n) x^n = \prod (1 - x^n)^R \equiv \prod (1 - x^{np^2})^q (1 - x^n)^r \pmod p.$$

Thus

$$p_R(n) \equiv \sum_k p_q(k)\, p_r(n - p^2 k) \pmod p.$$

Replace $n$ by $np^2 + r\nu$. Since $r\nu < p^2$, we obtain

$$p_R(np^2 + r\nu) \equiv \sum_k p_q(k)\, p_r((n - k)p^2 + r\nu) \pmod p.$$

Formula (7) now implies that

$$p_R(np^2 + r\nu) \equiv p_r(r\nu) \sum_k p_q(k)\, p_r(n - k) \pmod p,$$

so that $p_R(np^2 + r\nu) \equiv p_r(r\nu)\, p_{q+r}(n) \pmod p$, which is just (10).
As another application we prove 


THEOREM 3. For all odd $n$,

$$p_{15}\!\left( 53n^2 + \tfrac{5}{8}(n^2 - 1) \right) = 0. \tag{11}$$


Proof. The proof is by induction on the total number of prime factors of $n$.
For $n = 1$, (11) states that $p_{15}(53) = 0$, which is actually the case (4). Suppose
(11) proved for all integers with not more than $t$ prime factors. Let $p$ be an
odd prime. Then if $n$ has precisely $t$ prime factors, it will suffice to prove (11)
for $pn$. Put

$$a_n = 53n^2 + \tfrac{5}{8}(n^2 - 1).$$

Then

$$a_{pn} = p^2 a_n + \tfrac{5}{8}(p^2 - 1),$$

and Theorem 1 implies (with $r = 15$) that $p_{15}(a_{pn})$ is linear in $p_{15}(a_n)$ and
$p_{15}(a_{n/p})$. Now $p_{15}(a_n)$ vanishes by the induction hypothesis, and so does
$p_{15}(a_{n/p})$ if $p \mid n$. If $p \nmid n$, however, $a_{n/p}$ is not an integer (since 429 is
square-free) and so $p_{15}(a_{n/p})$ vanishes in this instance as well. Thus $p_{15}(a_{pn}) = 0$ and
the proof is complete.


We now prove 


THEOREM 4. Suppose that $a$ is such that for the modulus $m$, $p_r(a) \equiv 0 \pmod m$.
Suppose further that $24a + r$ is square-free. Then

$$p_r\!\left( an^2 + \tfrac{r}{24}(n^2 - 1) \right) \equiv 0 \pmod m, \tag{12}$$

where $(n, 2) = 1$ if $3 \mid r$ and $(n, 6) = 1$ otherwise.


Proof. As in Theorem 3, the proof is by induction on the total number of
prime factors of $n$. If $n = 1$, (12) states that $p_r(a) \equiv 0 \pmod m$, which is
true by hypothesis. Suppose (12) proved for all integers with not more than
$t$ prime factors. Let $p$ be a prime such that $p > 3$ when $(r, 3) = 1$ and $p > 2$
otherwise. Then if $n$ has precisely $t$ prime factors, it will suffice to prove (12)
for $pn$. Put

$$\lambda_n = an^2 + \tfrac{r}{24}(n^2 - 1).$$

Then

$$\lambda_{pn} = p^2 \lambda_n + \tfrac{r}{24}(p^2 - 1),$$

and Theorem 1 implies that $p_r(\lambda_{pn})$ is linear in $p_r(\lambda_n)$ and $p_r(\lambda_{n/p})$. Now $p_r(\lambda_n)
\equiv 0 \pmod m$ by hypothesis, and the same is true for $p_r(\lambda_{n/p})$ if $p \mid n$. If $p \nmid n$
however, $\lambda_{n/p}$ is not an integer since $24a + r$ is square-free, and so $p_r(\lambda_{n/p})$
vanishes. Thus $p_r(\lambda_{pn}) \equiv 0 \pmod m$ in either case, and the proof of Theorem
4 is complete.


Theorem 4 can be strengthened slightly by discarding the condition that
$24a + r$ be square-free and restricting $n$ to be divisible only by primes $p$ such
that $p > 2$ when $3 \mid r$, $p > 3$ when $(r, 3) = 1$, and $p^2 \nmid 24a + r$.

If we choose $r = 11$, $m = 13$ and $a = 6$ we find from (4) that $p_r(a) = p_{11}(6)
= -143 \equiv 0 \pmod{13}$, while $24a + r = 155$ is square-free. Theorem 4
applies and we have

$$p_{11}\!\left( 6n^2 + \tfrac{11}{24}(n^2 - 1) \right) \equiv 0 \pmod{13}, \qquad (n, 6) = 1. \tag{13}$$

Using formula (9) once again, we obtain the following interesting congruence
for the partition function mod 13:

$$p\!\left( 84n^2 - \tfrac{1}{24}(n^2 - 1) \right) \equiv 0 \pmod{13}, \qquad (n, 6) = 1. \tag{14}$$

Formula (14) is a Ramanujan congruence for the partition function, with
the difference that the terms form a quadratic, rather than an arithmetic,
progression.
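The first instances of these congruences can be confirmed by direct computation. The sketch below reads (13) as $p_{11}(6n^2 + \tfrac{11}{24}(n^2-1)) \equiv 0$ and (14) as $p(84n^2 - \tfrac{1}{24}(n^2-1)) \equiv 0 \pmod{13}$, and also spot-checks (9), $p(13n + 6) \equiv 11\,p_{11}(n) \pmod{13}$. The helper names are hypothetical, and the coefficients are generated by the standard divisor-sum recurrence rather than by any method described in the paper.

```python
def divisor_sums(limit):
    """Return [sigma(0), ..., sigma(limit)] with sigma(0) = 0 as a placeholder."""
    sig = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            sig[multiple] += d
    return sig


def eta_power_coeffs(r, limit):
    """Coefficients p_r(0..limit) of prod (1 - x^k)^r via n*p_r(n) = -r*sum sigma(k)*p_r(n-k)."""
    sig = divisor_sums(limit)
    coeffs = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = sum(sig[k] * coeffs[n - k] for k in range(1, n + 1))
        coeffs[n] = (-r * total) // n  # exact division
    return coeffs


p11 = eta_power_coeffs(11, 200)
partitions = eta_power_coeffs(-1, 2100)  # p(n) = p_{-1}(n)


def index_13(n):
    """Argument of p_11 in (13); an integer when (n, 6) = 1."""
    return 6 * n * n + 11 * (n * n - 1) // 24


def index_14(n):
    """Argument of p in (14); an integer when (n, 6) = 1."""
    return 84 * n * n - (n * n - 1) // 24


print(p11[6])
for n in (1, 5):
    print(n, index_13(n), p11[index_13(n)] % 13, index_14(n), partitions[index_14(n)] % 13)
print(all((partitions[13 * k + 6] - 11 * p11[k]) % 13 == 0 for k in range(20)))
```

For $n = 1$ the two arguments are 6 and 84, so the check reduces to $p_{11}(6) = -143$ and $p(84) = 26543660 = 13 \cdot 2041820$.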













More generally, we have 


THEOREM 5. Suppose that $p_{11}(a) \equiv 0 \pmod{13}$, and that $24a + 11$ is
square-free. Then

$$p_{11}\!\left( an^2 + \tfrac{11}{24}(n^2 - 1) \right) \equiv 0 \pmod{13}, \qquad (n, 6) = 1, \tag{15}$$

$$p\!\left( (13a + 6)n^2 - \tfrac{1}{24}(n^2 - 1) \right) \equiv 0 \pmod{13}, \qquad (n, 6) = 1. \tag{16}$$


The first few admissible $a$'s are 6, 10, 17, 18, 24, 27, 57, 68, 69, 74, 90, 95.
(This information is extracted from (4).) It is of interest to note that two
progressions

$$\left\{ a_1 n^2 + \tfrac{11}{24}(n^2 - 1) \right\}, \qquad \left\{ a_2 n^2 + \tfrac{11}{24}(n^2 - 1) \right\}$$

or

$$\left\{ (13a_1 + 6)n^2 - \tfrac{1}{24}(n^2 - 1) \right\}, \qquad \left\{ (13a_2 + 6)n^2 - \tfrac{1}{24}(n^2 - 1) \right\}$$

have no integers in common, since $24a_1 + 11$ and $24a_2 + 11$ are square-free.

III. In this section Table I gives $p_r(r\nu)$ for $r$ odd, $5 \le r \le 23$ and for
$3 \le p \le 23$. We exclude $r = 1, 3$ from the table since $p_1(n)$, $p_3(n)$ are known
explicitly. For $p = 3$ there is no entry unless $r$ is a multiple of 3. Using Table I
we can construct Table II of values of $c$, and we do so for $r$ odd, $5 \le r \le 23$
and for $p = 3, 5, 7$. The values of $p_r(r\nu)$ were extracted from (4) and some
unpublished tables in the author's possession giving the first 1000 coefficients
of $p_r(n)$ for $r$ odd, $5 \le r \le 23$. These were computed by means of a double
precision program on the IBM 704 of the National Bureau of Standards in
Washington, D.C.


TABLE II

 r \ p        3            5             7
   5                      -6            16
   7                      66          -176
   9        -12         -210         -1016
  11                   -2694          3544
  13                   11730         50008
  15       1836         3990        433432
  17                  114810       3034528
  19                 -645150      -3974432
  21      53028     -5556930      44496424
  23                23245050    1322977768





TABLE I. Values of $p_r(r\nu)$ for $r$ odd, $5 \le r \le 23$, and $3 \le p \le 23$. [The table entries are not legible in this copy.]


REFERENCES 


1. M. Newman, On the existence of identities for the coefficients of certain modular forms, J. Lond.
Math. Soc., 31 (1956), 350-359.
2. M. Newman, An identity for the coefficients of certain modular forms, J. Lond. Math. Soc., 30
(1955), 488-493.
3. M. Newman, Remarks on some modular identities, Trans. Amer. Math. Soc., 73 (1952), 313-320.
4. M. Newman, A table of the coefficients of the powers of $\eta(\tau)$, Proc. Kon. Nederl. Akad. Wetensch.,
Ser. A 59 = Indag. Math., 18 (1956), 204-216.
5. M. Newman, Congruences for the coefficients of modular forms and some new congruences for the
partition function, Can. J. Math., 9 (1957), 549-552.
6. M. Newman, Some theorems about $p_r(n)$, Can. J. Math., 9 (1957), 68-70.
7. G. Pall, On the arithmetic of quaternions, Trans. Amer. Math. Soc., 47 (1940), 487-500.
8. H. Zuckerman, Identities analogous to Ramanujan's identities involving the partition function,
Duke Math. J., 5 (1939), 88-110.





National Bureau of Standards 
Washington, D. C. 













ON BOUNDED MATRICES WITH NON-NEGATIVE 
ELEMENTS 


C. R. PUTNAM 


1. Introduction. It is known (Perron (10); Frobenius (5, 6)) that if
$A = (a_{ik})$ is a finite matrix with elements $a_{ik} \ge 0$, then $A$ has a real,
non-negative eigenvalue $\mu$, satisfying $\mu = \max|\lambda|$ where $\lambda$ is in the spectrum of $A$,
with a corresponding eigenvector $x = (x_1, \ldots, x_n)$ for which $x_i \ge 0$.
Moreover if $a_{ik} > 0$, then $\mu$ is a simple point of the spectrum with an eigenvector
$x$ (unique, except for constant multiples) with components $x_i > 0$. Much
has been written on this and related issues; cf., for example, the recent papers
(4, 12) wherein are given several references. Rutman and Krein (8, 11) have
placed the problem in the general setting of operators in a Banach space
leaving invariant certain cones.

In the present paper, a Hilbert space consisting of real vectors $x = (x_1, x_2,
\ldots)$, and bounded operators, represented by real matrices $A = (a_{ik})$, will be
considered. Thus, for any such $A$, there exists a constant $M > 0$ such that
$\|Ax\| \le M\|x\|$ whenever $\|x\|^2 = \sum x_k^2 < \infty$. For this case, the Rutman-Krein
results lead to certain theorems on completely continuous operators.
The object of the present note is to obtain certain analogous results for
operators which are not necessarily completely continuous. In fact, a series of
theorems will be given, where, in the beginning (cf. (I) below) only the
assumption $a_{ik} \ge 0$ (and boundedness) will be made, and, as needed, additional
restrictions will be imposed.

The author is indebted to the referee for pointing out some recent work of 
Bonsall (1, 2, 3) which includes, among other things, generalizations of 
certain results of Rutman and Krein. Theorem (I) below is contained in 
(2; cf. pp. 148 ff.), also Theorem B of (3, p. 54). 


2. By $A \ge 0$ and $A > 0$ will be meant that $a_{ik} \ge 0$ and $a_{ik} > 0$ respectively.
Similarly a vector $x = (x_1, x_2, \ldots)$ will be written $x \ge 0$ or $x > 0$ according
as all $x_i \ge 0$ or all $x_i > 0$ respectively. The spectrum of $A$ will be denoted by
sp($A$). The following will be proved:


(I) If $A \ge 0$, then $\mu = \sup|\lambda|$, where $\lambda$ is in sp($A$), also belongs to sp($A$).

(II) If $A \ge 0$ and if at least one diagonal element, say $d$, of $A^n$ (for some
$n \ge 1$) is positive, then $\mu$ of (I) above satisfies $\mu^n \ge d$.

(III) If $A \ge 0$ and if $\mu$ of (I) is positive and is a pole of the resolvent $R(\lambda) =
(A - \lambda I)^{-1}$ (hence, in particular, $\mu$ is an isolated point of sp($A$) and is in the
point spectrum) then there exists at least one characteristic vector $x$ ($Ax = \mu x$,
$x \ne 0$) satisfying $x \ge 0$.

(IV) If $A > 0$ (or even, if for every pair $i$, $k$ there exists an integer $M =
M(i, k) \ge 1$ such that $(A^M)_{ik} > 0$) and if $\mu$ of (I) (where, by (II), $\mu > 0$) is a
pole of $R(\lambda)$, then $\mu$ is (a) a simple pole of $R(\lambda)$ and (b) a simple characteristic
number. Moreover, (c) there exists a characteristic vector $x > 0$ belonging to $\mu$.


Received March 6, 1958. This research was supported by the United States Air Force under
Contract No. AF 18 (603)-139.
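In the finite-dimensional case recalled in the introduction, statements of this kind can be observed numerically. The sketch below applies power iteration to a small matrix with positive entries; the matrix, the iteration count and the function names are assumptions chosen for the illustration, and the method is a standard numerical device, not the argument of this paper. It reports the dominant eigenvalue $\mu$, a positive eigenvector, and a comparison of $\mu^n$ with the diagonal elements of $A^n$ as in (II).

```python
def mat_vec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def mat_mul(a, b):
    """Matrix product for matrices stored as lists of rows."""
    columns = list(zip(*b))
    return [[sum(u * v for u, v in zip(row, col)) for col in columns] for row in a]


def perron_pair(a, iterations=200):
    """Power iteration from the all-ones vector; returns (mu, x) with max(x) = 1."""
    x = [1.0] * len(a)
    mu = 0.0
    for _ in range(iterations):
        y = mat_vec(a, x)
        mu = max(y)               # entries are positive, so this is the sup norm
        x = [yi / mu for yi in y]
    return mu, x


A = [[1.0, 2.0],
     [3.0, 2.0]]                  # eigenvalues 4 and -1

mu, x = perron_pair(A)
print("mu =", mu, "eigenvector =", x)

power = A
for n in range(1, 5):
    d = max(power[k][k] for k in range(len(A)))
    print(n, "largest diagonal element of A^n:", d, "mu^n:", mu ** n)
    power = mat_mul(power, A)
```

For this matrix $\mu = 4$ with eigenvector proportional to $(2, 3)$, and the largest diagonal element of $A^2$ is 10, which does not exceed $\mu^2 = 16$.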


Remarks. The above theorems are patterned after similar ones, in which A 
is supposed to be completely continuous, of (8, pp. 80-82). (Cf. also the last 
paragraph of §1 above.) Parts of some of the proofs (as noted below) are vir- 
tually identical with those in (8) but, in order to make the present paper self- 
contained, complete proofs of all the theorems will be given. 

In (I) and (II), where only A > 0 is assumed, it should be noted that 
u may not be in the point spectrum and that A may not have any point 
spectrum whatever. In fact, if A is the Jacobi matrix belonging to 2 }>x,%_4:, 
then A > 0 and A has no point spectrum. Actually, A can be chosen so as to 
satisfy A > 0, for instance, a bounded Toeplitz matrix with positive elements; 
then necessarily the point spectrum is empty (cf. 2, p. 149; 7, p. 868). 

In (III) and (IV), the assumption that u (>0) be a pole of R(A) is surely 
fulfilled if A is completely continuous. In fact, in this case, the above results 
(except possibly (a) of (IV)) are contained in the results of (8, pp. 80-82). 
Part (a) of (IV) does not seem to be contained here, although something 
similar to it (when A is completely continuous) is contained in Theorems 5 
(Rutman) and 5a (Krein) of (11, pp. 91-92). In these latter theorems however, 
it is assumed that the “invariant cone’”’ has an interior point. In the present 
case however, what corresponds to this cone is the set of vectors x > 0 in 
Hilbert space, and this set has no interior points. 

It can be remarked that the statement given in (8, p. 91), namely that if $A$ is completely continuous, if $A \ge 0$, and if all the diagonal elements of every power $A^n$ are zero, then 0 is the only point of sp($A$), may not be true without the assumption of complete continuity. (The proof given, loc. cit., pp. 91-92, involves the approximation of $A$ by its sections.)


3. Proof of (I). Since $A$ is bounded its spectrum is contained in a finite portion of the complex plane, so that the resolvent $R(\lambda)$ is given by $R(\lambda) = -\sum A^n/\lambda^{n+1}$ for $|\lambda|$ sufficiently large. The elements of $-R(\lambda)$ are series of the form $\sum a_n z^n$ where $z = \lambda^{-1}$ and $a_n \ge 0$. If every one of these series is convergent for $|z|$ arbitrarily large, then, of course, 0 is the only point of the spectrum of $A$. Otherwise, there exists a real number $\alpha > 0$ such that every series is convergent for $|z| < \alpha$ (that is, for $|\lambda| > 1/\alpha$) but at least one series is divergent for $|z| > \alpha$. By the Vivanti-Pringsheim theorem (9, p. 72), $z = \alpha$ must be a singularity of such a series. Consequently the number $\lambda = 1/\alpha$ is in sp($A$) while every $\lambda$ in sp($A$) satisfies $|\lambda| \le 1/\alpha$. This completes the proof of (I).*

*The above proof, using the Vivanti-Pringsheim theorem, is due to the late Professor Wintner, 
with whom the author had several valuable discussions concerning non-negative matrices. 













4. Proof of (II). Let $d$ denote the $k$th diagonal element of $A^n$. Then the $k$th diagonal element of $A^{nm}$ is given by a series with non-negative terms, one of which is $d^m$. Hence, if $\lambda$ is real and satisfies $\lambda > \mu$, then the $k$th diagonal element of $-R(\lambda)$, which is not less than the $k$th diagonal element of

$\sum_{m=1}^{\infty} A^{nm}/\lambda^{nm+1}$,

is not less than

$\sum_{m=1}^{\infty} d^m/\lambda^{nm+1}$.

But, since this last series is divergent if $\lambda \le d^{1/n}$, it follows that $\mu \ge d^{1/n}$. This completes the proof of (II). Cf. (8, pp. 68-69), wherein is given a similar proof for a completely continuous operator in a Banach space.
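The inequality just proved, $\mu \ge d^{1/n}$ with $d$ the $k$th diagonal element of $A^n$, can be checked numerically in the simplest finite situation. The following is only an illustrative sketch: the paper treats bounded infinite matrices, whereas the 2 x 2 matrix `A`, its largest characteristic number `MU` and the helper names below are assumptions chosen for this example.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical example matrix with non-negative elements.
# Its characteristic polynomial is t^2 - t - 6 = (t - 3)(t + 2), so mu = 3.
A = [[1, 2],
     [3, 0]]
MU = 3.0

def diagonal_roots(a, k, max_power):
    """Return d**(1/n) for n = 1..max_power, d being the kth diagonal element of a**n."""
    roots = []
    power = a
    for n in range(1, max_power + 1):
        roots.append(power[k][k] ** (1.0 / n))
        power = matmul(power, a)
    return roots

bounds = diagonal_roots(A, 0, 8)
for n, bound in enumerate(bounds, start=1):
    print(n, round(bound, 6), bound <= MU)
```

Each printed bound stays below `MU`, as (II) requires; the check says nothing about the infinite-dimensional case beyond illustrating the statement.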


5. Proof of (III). If $\mu > 0$ is a pole of $R(\lambda)$ then $R(\lambda)$ is given by

$R(\lambda) = \sum_{n=-N}^{\infty} c_n (\lambda - \mu)^n$

for $|\lambda - \mu|$ sufficiently small and positive, where $N \ge 1$, $c_{-N} \le 0$ and $c_{-N} \neq 0$. In fact, as was noted above, if $\mu > 0$, then some element of $R(\lambda)$ tends to $-\infty$ when $\lambda \to \mu + 0$ ($\lambda$ real). The remainder of the proof is essentially identical with that of (8, p. 66). For

$(\lambda - \mu)^N I = (\lambda - \mu)^N (A - \lambda I) R(\lambda) = (A - \lambda I) c_{-N} + B$,

where $B$ represents a term which tends to 0 as $\lambda$ tends to $\mu$. Consequently $(A - \mu I) c_{-N} = 0$ and a characteristic vector $x \ge 0$ of $\mu$ is given by any column of $-c_{-N}$ which does not consist entirely of zeros.


6. Proof of (a) of (IV). It follows from the functional equation of the resolvent that $dR/d\lambda = R^2(\lambda)$. Hence, if $N$ (see above) satisfies $N > 1$, then, on equating coefficients in the series for $dR/d\lambda$ and $R^2(\lambda)$ one obtains $c_{-N}^2 = 0$. If $c_{-N} = (c_{ik})$, then this last result is that

$\sum_m c_{im} c_{mk} = 0$.

In particular, if $i = k$, it follows that $c_{ii} = 0$, for all $i = 1, 2, \ldots$. By hypothesis, for every pair $(i, k)$ there exists a positive integer $M = M(i, k)$ such that $(A^M)_{ik} > 0$. But $A^M c_{-N} = \mu^M c_{-N}$, and hence

$\sum_m (A^M)_{im} c_{mi} = \mu^M c_{ii} = 0$.

Consequently, $c_{ki} = 0$ ($i, k$ arbitrary) and so $c_{-N} = 0$, a contradiction. This completes the proof of (a).


Proof of (c) of (IV). The proof in §5 shows that if some element $c_{im}$ of the $m$th column of $c_{-N}$ is zero, then in fact every element $c_{1m}, c_{2m}, \ldots$, of this column is zero. For, suppose $c_{im} = 0$; then for this fixed $i$, choose an arbitrary $k$ and then $M = M(i, k)$ as before. Then

$\sum_p (A^M)_{ip} c_{pm} = \mu^M c_{im} = 0$

and hence $c_{km} = 0$ ($k = 1, 2, \ldots$). Consequently, since $c_{-N} \le 0$ and $c_{-N} \neq 0$, it follows that there exists at least one column of $-c_{-N}$ consisting of positive elements only. This column serves as a positive characteristic vector and the proof of (c) is complete.


Proof of (b) of (IV). The proof is essentially that given in (8, pp. 78-80, 82) for integral equations. First, let $y$ be any characteristic vector of $A^*$. Then

$\sum_k A_{ki} y_k = \mu y_i$

and hence

$\sum_k A_{ki} |y_k| \ge \mu |y_i|$.

Let $x$ be a positive characteristic vector of $A$ belonging to $\mu$ (see above). Multiplication by $x_i$ of both sides of the last inequality followed by a summation and an interchange of the orders of summations, yields

$\sum_k x_k |y_k| \ge \sum_i x_i |y_i|$

where, since all $x_i > 0$, the sign $>$ (and hence a contradiction) obtains only if the components of $y$ fail to satisfy either all $y_i \ge 0$ or all $y_i \le 0$. Thus if $y$ is any characteristic vector of $A^*$ belonging to $\mu$, then either $y \ge 0$ or $y \le 0$. Interchanging the roles of $A$ and $A^*$ (and noting that $R^*(\bar{\lambda})$ is the resolvent of $A^*$ and that $\mu$ plays the same role for $A^*$ as it does for $A$) it follows that any characteristic vector $z$ of $A$ belonging to $\mu$ satisfies either $z \ge 0$ or $z \le 0$. Consequently, $\mu$ is a simple point of the spectrum of $A$. Otherwise, there would exist a characteristic vector, say $z$ (necessarily $z \ge 0$ or $z \le 0$) orthogonal to $x$ ($> 0$) and this is clearly impossible. This completes the proof of (b) of (IV).


REFERENCES 


1. F. F. Bonsall, Endomorphisms of partially ordered vector spaces, J. Lond. Math. Soc., 30 (1955), 133-144.

2. F. F. Bonsall, Endomorphisms of a partially ordered vector space without order unit, J. Lond. Math. Soc., 30 (1955), 144-153.

3. F. F. Bonsall, Linear operators in complete positive cones, Proc. Lond. Math. Soc., Ser. 3, 8 (1958), 53-75.

4. A. Brauer, A new proof of theorems of Perron and Frobenius on non-negative matrices, I. Positive matrices, Duke Math. J., 24 (1957), 367-378.

5. G. Frobenius, Ueber Matrizen aus positiven Elementen, Sitzungsberichte Preuss. Akad. Wiss. (Berlin, 1908), 471-476; (1909), 514-518.

6. G. Frobenius, Ueber Matrizen aus nicht negativen Elementen, Sitzungsberichte Preuss. Akad. Wiss. (Berlin, 1912), 456-477.

7. P. Hartman and A. Wintner, The spectra of Toeplitz's matrices, Amer. J. Math., 76 (1954), 867-882.

8. M. G. Krein and M. A. Rutman, Linear operators leaving invariant a cone in a Banach space, Uspehi Matem. Nauk (N.S.) 3, no. 1 (23) (1948), 3-95; Amer. Math. Soc. Translation No. 26 (page references in paper refer to this translation).

9. E. Landau, Darstellung und Begründung einiger neuerer Ergebnisse der Funktionentheorie (Berlin, 1929).

10. O. Perron, Zur Theorie der Matrices, Math. Ann., 64 (1907), 248-263.

11. M. A. Rutman, Sur les opérateurs totalement continus linéaires laissant invariant un certain cône, Math. Sbornik (N.S.), 8, no. 1 (50) (1940), 77-96.

12. H. Samelson, On the Perron-Frobenius theorem, Michigan Math. J., 4 (1957), 57-59.


Purdue University 











ORTHOGONAL POLYNOMIALS AND HYPERGEOMETRIC 
SERIES 


A. VAN DER SLUIS 


Introduction. In Part I of this paper we present a theory of Padé-approxi- 
mants for Laurent series, and discuss their relation to orthogonal polynomials. 
For earlier results in this direction we may refer to (1; 7; 8). It is also indicated 
how this theory can be extended, for example, to matrix polynomials. 

In order to derive certain special types of orthogonal polynomials we 
need explicit expressions for Padé-approximants. In Part II we generalize a 
result of Padé (5), giving such expressions in the hypergeometric case. The 
resulting polynomials are the classical ones and basic analogues of them. 
Concerning these analogues see also Hahn (2). 

In the final part it is proved that under a much more natural and apparently less restrictive condition no more general polynomials result than those obtained in Part II.


PART I 


1. Orthogonal polynomials. Our definition of orthogonality will be similar to the generalized definition of Krall (3). Suppose we are given a sequence $r_0, r_1, r_2, \ldots$, in a field $R$, such that each set of equations

(1.1)   $q_0 r_0 + q_1 r_1 + \ldots + q_m r_m = 0$,
        $\ldots\ldots\ldots$
        $q_0 r_{m-1} + q_1 r_m + \ldots + q_m r_{2m-1} = 0$,        $m = 1, 2, 3, \ldots$,

has exactly one solution with $q_m = 1$ in $R$ (in that case the sequence $\{r_n\}$ will be called regular). We then define a moment operator $\Omega$ operating on polynomials as follows

(1.2)   $\Omega \sum_{\mu=0}^{n} p_\mu x^\mu = \sum_{\mu=0}^{n} p_\mu r_\mu$,

and the set of polynomials $Q_0(x), Q_1(x), Q_2(x), \ldots$, over $R$ of respective degrees $0, 1, 2, \ldots$, is called orthogonal with respect to the sequence $r_0, r_1, r_2, \ldots$, if

(1.3)   $\Omega Q_m(x) x^n = 0$,        $n < m$,

Received June 27, 1957; in revised form February 14, 1958. 

Parts I and II of this paper cover parts of a thesis submitted by the author to the University 
of Amsterdam (General orthogonal polynomials, Groningen, 1956). This paper was completed 
while the author held a fellowship at the Summer Research Institute of the Canadian Mathe- 
matical Congress, Kingston, 1957. The author is greatly indebted to Professors Dr. J. Popken 
(Amsterdam) and Dr. F. van der Blij (Utrecht) for valuable hints and kind assistance during 
the preparation of this paper. 


which is equivalent to

(1.4)   $\Omega Q_n(x) Q_m(x) = 0$,        $n \neq m$.


If $R$ is the field of real numbers it is not true that each regular sequence $\{r_n\}$ can be obtained as a moment sequence of a non-negative distribution. Hence this notion of orthogonal polynomials is essentially more general than the usual one.

If the polynomial

$Q_m(x) = \sum_\mu q_\mu x^\mu$,

then from (1.3) it follows that the coefficients $q_\mu$ satisfy the equations (1.1); from the regularity of the sequence $\{r_n\}$ it follows that, apart from a constant factor, $Q_m$ is uniquely determined.

If the orthogonal polynomials are considered in their monic form, that is, the coefficient of the highest power of $x$ is one, then for the norm of $Q_m$, defined as $N_m = \Omega Q_m^2(x)$, we find that

(1.5)   $N_m = q_0 r_m + q_1 r_{m+1} + \ldots + q_m r_{2m}$,        $m \ge 0$,

where again $Q_m(x) = \sum q_\mu x^\mu$.

These norms are not zero, for suppose $N_m = 0$, then the set (1.1) with $m + 1$ instead of $m$ would have a non-trivial solution with $q_{m+1} = 0$.
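The definitions of this section can be tried out on a concrete moment sequence. The sketch below solves the system (1.1) exactly over the rationals and evaluates the norm by (1.5); the choice of the Legendre moments (the integrals of $x^n$ over $[-1, 1]$) and all function names are assumptions made for illustration only.

```python
from fractions import Fraction

def solve_exact(a, b):
    """Solve the square system a x = b over the rationals (Gauss-Jordan elimination)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular system: the sequence is not regular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def monic_orthogonal(r, m):
    """Coefficients q_0, ..., q_m (with q_m = 1) of Q_m, obtained from (1.1)."""
    if m == 0:
        return [Fraction(1)]
    # Row n of (1.1): q_0 r_n + ... + q_{m-1} r_{n+m-1} = -r_{n+m}, n = 0..m-1.
    a = [[Fraction(r[n + mu]) for mu in range(m)] for n in range(m)]
    b = [-Fraction(r[n + m]) for n in range(m)]
    return solve_exact(a, b) + [Fraction(1)]

def norm(r, m):
    """N_m = q_0 r_m + q_1 r_{m+1} + ... + q_m r_{2m}, formula (1.5)."""
    q = monic_orthogonal(r, m)
    return sum(q[mu] * r[mu + m] for mu in range(m + 1))

# Assumed example: moments of the uniform weight on [-1, 1].
legendre_moments = [Fraction(2, n + 1) if n % 2 == 0 else Fraction(0)
                    for n in range(8)]
print(monic_orthogonal(legendre_moments, 2))  # coefficients of x^2 - 1/3
print(norm(legendre_moments, 2))              # 8/45
```

For this sequence the solver returns the monic Legendre polynomials, and a vanishing pivot would signal that the sequence is not regular in the sense of (1.1).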


2. Padé-approximants. Consider the formal Laurent series

$D(x) = \sum_{\mu=-\infty}^{\infty} p_\mu x^\mu$

with coefficients in $R$ (in the sequel we will always suppress the limits of the summation index $\mu$: these are $-\infty$ and $+\infty$). If $n$ is any integer and $m$ is a positive integer, the non-zero polynomial $V_{m,n}(x)$ of degree $\le m$ will be called a (Padé-) denominator of $D(x)$ for the place $(m, n)$, if in the formal product $V_{m,n}(x) D(x)$ the terms containing $x^{n+1}, x^{n+2}, \ldots, x^{n+m}$ have zero coefficients; any non-zero constant $V_{0,n}(x)$ will be called a denominator of $D(x)$ for the place $(0, n)$.

For each denominator $V_{m,n}(x)$ we define a numerator $U_{m,n}(x)$ as the series obtained from $V_{m,n}(x) D(x)$ by cancelling all terms after the one containing $x^n$. Then the pair $(U_{m,n}(x), V_{m,n}(x))$ will be called a (Padé-)approximant of $D(x)$ for the place $(m, n)$.

We recall that if $D(x)$ is a formal power series over the real numbers, then $U_{m,n}(x) V_{m,n}(x)^{-1}$ is a Padé-fraction for the place $(m, n)$ (cf., for example, Perron (6, §73)). Hence to any Padé-approximant of a real power series there corresponds a Padé-fraction.

If $k$ is an arbitrary, but fixed, integer, a sequence of approximants belonging to the places $(m, m + k)$, $m = 0, 1, 2, \ldots$, will be called a diagonal of order $k$, denoted by $D_k$ (the words "place" and "diagonal" are of course derived from the notion of the Padé-table (6)).










For each approximant $(U_{m,n}(x), V_{m,n}(x))$ we define an element $p_{m,n}$ (which may be zero) by means of the relation

(2.1)   $V_{m,n}(x) D(x) - U_{m,n}(x) = p_{m,n} x^{m+n+1} + $ higher powers of $x$.

We remark that for $m > 0$ the non-zero polynomial $V_{m,n}(x) = q_0 x^m + \ldots + q_{m-1} x + q_m$ is a denominator of $\sum p_\mu x^\mu$ for the place $(m, n)$ if and only if

(2.2)   $q_0 p_{n-m+1} + \ldots + q_m p_{n+1} = 0$,
        $\ldots\ldots\ldots$
        $q_0 p_n + \ldots + q_m p_{n+m} = 0$.

Since the number of equations is always less than the number of unknowns, each place $(m, n)$ has a Padé-approximant.

For the $p_{m,n}$ corresponding to this $V_{m,n}$ we find

(2.3)   $q_0 p_{n+1} + \ldots + q_m p_{n+m+1} = p_{m,n}$.

If $(U, V)$ and $(U^*, V^*)$ are approximants for the same place, their sum, defined as $(U + U^*, V + V^*)$, is again an approximant for that place, and a constant multiple $p \cdot (U, V)$, defined as $(pU, pV)$, is an approximant for each place for which $(U, V)$ is an approximant, except of course in the trivial case when this sum or multiple results in $(0, 0)$.
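To make the conditions (2.2) and (2.3) concrete, the following sketch computes a denominator normalized by $q_m = 1$ and the corresponding element $p_{m,n}$ for an ordinary power series. The exponential series is an assumed example, and the helper names are illustrative rather than part of the paper.

```python
from fractions import Fraction
from math import factorial

def solve_exact(a, b):
    """Solve the square system a x = b over the rationals (Gauss-Jordan elimination)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution with q_m = 1 for this place")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def pade_denominator(p, m, n):
    """Return [q_0, ..., q_m], q_m = 1, for V(x) = q_0 x^m + ... + q_m solving (2.2).

    p(mu) must return the coefficient of x^mu in D(x) as a Fraction.
    """
    if m == 0:
        return [Fraction(1)]
    # Row i states that the coefficient of x^(n+1+i) in V(x) D(x) vanishes.
    rows = [[p(n + 1 + i - m + j) for j in range(m)] for i in range(m)]
    rhs = [-p(n + 1 + i) for i in range(m)]
    return solve_exact(rows, rhs) + [Fraction(1)]

def remainder_coefficient(p, q, m, n):
    """The element p_{m,n} of (2.3): the coefficient of x^(m+n+1) in V(x) D(x)."""
    return sum(q[j] * p(n + 1 + j) for j in range(m + 1))

def exp_coeff(mu):
    """Coefficients of the assumed example D(x) = exp(x)."""
    return Fraction(1, factorial(mu)) if mu >= 0 else Fraction(0)

q = pade_denominator(exp_coeff, 2, 2)
print(q)                                         # V(x) = x^2/12 - x/2 + 1
print(remainder_coefficient(exp_coeff, q, 2, 2)) # 1/720
```

Fixing $q_m = 1$ makes the system square, so the sketch fails by design at places where no denominator with constant term 1 is uniquely determined.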


3. Regularity. The Padé-approximant $(U_{m,n}(x), V_{m,n}(x))$ of the series $D(x)$ for the place $(m, n)$ will be called regular if (a) the constant term of $V_{m,n}(x)$ is 1, that is, $V_{m,n}(0) = 1$, and (b) any other approximant for the place $(m, n)$ is a constant multiple of $(U_{m,n}, V_{m,n})$.

Clearly condition (b) is equivalent to the condition that any other denominator for the place $(m, n)$ is a constant multiple of $V_{m,n}$.

The denominator of a regular approximant will also be called regular. A set of approximants will be called regular if each of its elements is regular.

For any place $(m, n)$ there exists at most one regular approximant; 1 is a regular denominator for each place $(0, n)$ and for the corresponding $p_{0,n}$ we have $p_{0,n} = p_{n+1}$ if $D(x) = \sum p_\mu x^\mu$.

As an immediate consequence of our definition, we have

THEOREM 3.1. The series $D(x) = \sum p_\mu x^\mu$ has a regular approximant for the place $(m, n)$, $m > 0$, if and only if the set (2.2), considered as equations in the $q_\mu$, has exactly one solution with $q_m = 1$.

If $(U_{m,n}, V_{m,n})$ is an approximant such that the corresponding $p_{m,n} = 0$, then $x(U_{m,n}, V_{m,n})$, defined as $(xU_{m,n}, xV_{m,n})$, is an approximant for the place $(m + 1, n + 1)$, the constant term of the denominator of which is 0; hence it cannot be a constant multiple of a regular approximant. This proves the following:

THEOREM 3.2. If $D(x)$ has a regular diagonal $D_k$, then none of the corresponding $p_{m,m+k}$ is zero.







If for a place $(m, n)$ there exists no regular approximant, then there is an approximant for that place in which the denominator has constant term zero. For suppose that $V_{m,n}$ is a denominator such that $V_{m,n}(0) \neq 0$. Then there is certainly a $V^*_{m,n}$ that is no constant multiple of $V_{m,n}$. Then $V^*_{m,n}(x) - V^*_{m,n}(0) V_{m,n}(0)^{-1} V_{m,n}(x)$ is not zero, is a denominator for the place $(m, n)$ and has constant term zero. This will be used in proving the following theorem, which is more or less a converse of Theorem 3.2:

THEOREM 3.3. Let the series $D(x)$ have a diagonal $D_k$ consisting of approximants $(U_{m,m+k}, V_{m,m+k})$ such that the corresponding $p_{m,m+k}$ are all different from zero, and $V_{m,m+k}(0) = 1$ for all $m$. Then $D_k$ is regular.

Proof. The approximant $(U_{0,k}, V_{0,k})$ has denominator 1, hence is regular. Suppose that for a certain $m$ the approximant $(U_{m,m+k}, V_{m,m+k})$ is regular, and moreover that for the place $(m + 1, m + k + 1)$ there exists no regular approximant. Then there is an approximant $(U^*, V^*)$ for this place such that $V^*(0) = 0$, hence $x^{-1}(U^*, V^*)$ is an approximant for the place $(m, m + k)$, and the corresponding $p^*_{m,m+k}$ is zero. However, since $p_{m,m+k} \neq 0$, this approximant cannot be a constant multiple of $(U_{m,m+k}, V_{m,m+k})$, contradicting the assumed regularity of the latter. Hence, if $(U_{m,m+k}, V_{m,m+k})$ is regular, there is a regular approximant for the place $(m + 1, m + k + 1)$, and $V_{m+1,m+k+1}$ is its denominator since $V_{m+1,m+k+1}(0) = 1$. Induction completes the proof.

We remark that the proof of this theorem shows that if a series $D(x)$ has a diagonal $D_k$ for which the corresponding $p_{m,m+k}$ are all $\neq 0$, then all the approximants are constant multiples of regular approximants. Thus the condition $V_{m,m+k}(0) = 1$ in Theorem 3.3 has only a normalizing effect, and does not essentially restrict the class of series which have a regular diagonal $D_k$ according to this theorem.


4. Approximants of $D(x^2)$. In this section we give a theorem relating approximants of the series $D(x^2)$ to those of the series $D(x)$.

THEOREM 4.1. If the series $D(x) = \sum p_\mu x^\mu$ has regular diagonals $D_k$ and $D_{k+1}$ consisting of approximants $(U_{m,n}(x), V_{m,n}(x))$, then the series $D^*(x) = \sum p_\mu x^{2\mu}$ has a regular diagonal $D^*_{2k+1}$ consisting of approximants $(U^*_{m,n}(x), V^*_{m,n}(x))$ for which we have

(4.1)   $V^*_{2m,2m+2k+1}(x) = V_{m,m+k}(x^2)$,   $V^*_{2m+1,2m+2k+2}(x) = V_{m,m+k+1}(x^2)$,

together with analogous relations for $U$. Moreover, if we put

$V^*_{m,n}(x) D^*(x) - U^*_{m,n}(x) = p^*_{m,n} x^{m+n+1} + \ldots$

then

(4.2)   $p^*_{2m,2m+2k+1} = p_{m,m+k}$,   $p^*_{2m+1,2m+2k+2} = p_{m,m+k+1}$.

Proof. Consider any approximant $(U(x), V(x))$ of $D(x^2)$ for the place $(2m + 1, 2m + 2k + 2)$; then

$V(x) D(x^2) - U(x) = p\, x^{4m+2k+4} + q\, x^{4m+2k+5} + \ldots$.

Now $V(x)$ can be written as $V'_m(x^2) + x V''_m(x^2)$, where the subscript denotes the highest power of the polynomial variable. Similarly

$U(x) = U'_{m+k+1}(x^2) + x U''_{m+k}(x^2)$.

Then we have

$V'_m(x^2) D(x^2) - U'_{m+k+1}(x^2) = p\, x^{4m+2k+4} + \ldots$,
$x V''_m(x^2) D(x^2) - x U''_{m+k}(x^2) = q\, x^{4m+2k+5} + \ldots$,

or

(4.3)   $V'_m(x) D(x) - U'_{m+k+1}(x) = p\, x^{2m+k+2} + \ldots$,
(4.4)   $V''_m(x) D(x) - U''_{m+k}(x) = q\, x^{2m+k+2} + \ldots$.

From the regularity of $D_{k+1}$ it follows that, apart from an arbitrary constant factor, there is exactly one $V'_m(x)$ satisfying (4.3), viz. $V_{m,m+k+1}(x)$, and that $p = p_{m,m+k+1}$ if $V'_m = V_{m,m+k+1}$.

On the other hand, from Theorem 3.2 and the regularity of $D_k$ it follows that the only $V''_m(x)$ satisfying (4.4) is the zero polynomial. Hence all possible $V(x)$ are constant multiples of $V_{m,m+k+1}(x^2)$. Since, moreover, $V_{m,m+k+1}(0) = 1$, it follows that $V^*_{2m+1,2m+2k+2}(x) = V_{m,m+k+1}(x^2)$ is regular. It follows also that $p^*_{2m+1,2m+2k+2} = p_{m,m+k+1}$. The remaining part of the theorem can be proved in a similar way.

It may be shown that no series $D(x^n)$, with $n > 2$, can have a regular diagonal.


5. Recurrence relations for approximants. 


THEOREM 5.1. Let the series $D(x)$ have diagonals $D_k$ and $D_{k+1}$ consisting of approximants $(U_{m,n}, V_{m,n})$ such that $D_k$ is regular and $V_{m,m+k+1}(0) = 1$ for all $m$. Then we have

(5.1)   $U_{m+1,m+k+1} = U_{m,m+k+1} - p_{m,m+k+1}\, p_{m,m+k}^{-1}\, x\, U_{m,m+k}$,

(5.2)   $V_{m+1,m+k+1} = V_{m,m+k+1} - p_{m,m+k+1}\, p_{m,m+k}^{-1}\, x\, V_{m,m+k}$.

Proof. The highest powers in the right-hand members of (5.1) and (5.2) are at most $m + k + 1$ and $m + 1$ respectively, whereas the right-hand side of (5.2) cannot be identically zero since it has constant term 1. Finally, multiplying the right-hand side of (5.2) by $D(x)$ and subtracting the right-hand side of (5.1) shows that the right-hand members of (5.1) and (5.2) constitute an approximant for the place $(m + 1, m + k + 1)$. Since $D_k$ is regular the relations (5.1) and (5.2) must be true.


In a similar way one can prove 











THEOREM 5.2. Let the series $D(x)$ have diagonals $D_k$ and $D_{k+1}$ consisting of approximants $(U_{m,n}, V_{m,n})$ such that $D_{k+1}$ is regular and $V_{m,m+k}(0) = 1$ for all $m$. Then we have

(5.3)   $V_{m+1,m+k+2} = V_{m+1,m+k+1} - p_{m+1,m+k+1}\, p_{m,m+k+1}^{-1}\, x\, V_{m,m+k+1}$

together with an analogous relation for $U$.

Combining these two theorems we obtain

THEOREM 5.3. Let the series $D(x)$ have regular diagonals $D_k$ and $D_{k+1}$ consisting of approximants $(U_{m,n}, V_{m,n})$. Then for all $m$ we have

(5.4)   $V_{m+1,m+k+1} = (1 - a_m x) V_{m,m+k} - b_m x^2 V_{m-1,m+k-1}$

where

(5.5)   $a_m = p_{m,m+k}\, p_{m-1,m+k}^{-1} + p_{m,m+k+1}\, p_{m,m+k}^{-1}$,

(5.6)   $b_m = p_{m,m+k}\, p_{m-1,m+k-1}^{-1}$.

An analogous relation for $U$ holds likewise.
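The elimination that combines Theorems 5.1 and 5.2 into (5.4) is not written out in the text. A sketch of it, using only relation (5.2) at $m$ and at $m - 1$ and relation (5.3) at $m - 1$, might read as follows; the abbreviations $\alpha$, $\beta$, $\gamma$ are introduced here for brevity and are not the author's notation.

```latex
\begin{align*}
\alpha &= p_{m,m+k+1}\,p_{m,m+k}^{-1}, \qquad
\beta = p_{m,m+k}\,p_{m-1,m+k}^{-1}, \qquad
\gamma = p_{m-1,m+k}\,p_{m-1,m+k-1}^{-1}, \\
V_{m+1,m+k+1} &= V_{m,m+k+1} - \alpha\,x\,V_{m,m+k}
  && \text{(5.2)} \\
V_{m,m+k+1} &= V_{m,m+k} - \beta\,x\,V_{m-1,m+k}
  && \text{(5.3) with $m-1$ in place of $m$} \\
V_{m-1,m+k} &= V_{m,m+k} + \gamma\,x\,V_{m-1,m+k-1}
  && \text{(5.2) with $m-1$ in place of $m$} \\
V_{m+1,m+k+1} &= \bigl(1 - (\alpha + \beta)x\bigr)V_{m,m+k}
  - \beta\gamma\,x^{2}\,V_{m-1,m+k-1}.
\end{align*}
```

Here $\alpha + \beta$ is the $a_m$ of (5.5) and, in a commutative field, $\beta\gamma = p_{m,m+k}\,p_{m-1,m+k-1}^{-1}$ is the $b_m$ of (5.6).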


6. Orthogonality relations of approximants. Let $D(x) = \sum p_\mu x^\mu$ have a regular diagonal $D_k$ consisting of approximants $(U_{m,m+k}, V_{m,m+k})$. If we write $V_{m,m+k} = q_0 x^m + q_1 x^{m-1} + \ldots + q_m$ where the $q_\mu$ of course depend on $m$, then the coefficients satisfy the equations (2.2) with $n = m + k$. By the substitution

(6.1)   $p_{k+1} = r_0$,   $p_{k+2} = r_1$,   $p_{k+3} = r_2, \ldots$

this system is transformed into (1.1), the condition for orthogonality. Since $D_k$ is regular, it follows that for each $m > 0$ and $n = m + k$ the set (2.2) has exactly one solution with $q_m = 1$, hence the same is true for the related system (1.1). Hence the polynomials $q_0 + q_1 x + \ldots + q_m x^m = x^m V_{m,m+k}(x^{-1})$ form a set of orthogonal polynomials with respect to the sequence $p_{k+1}, p_{k+2}, p_{k+3}, \ldots$, and are monic by virtue of $V_{m,m+k}(0) = 1$.

The same substitution (6.1) transforms (2.3) into (1.5). Hence the norm of $x^m V_{m,m+k}(x^{-1})$ is $p_{m,m+k}$. This proves

THEOREM 6.1. Let the series $D(x) = \sum p_\mu x^\mu$ have a regular diagonal $D_k$ consisting of approximants $(U_{m,m+k}, V_{m,m+k})$; then the polynomials $V_m(x)$ defined by

(6.2)   $V_m(x) = x^m V_{m,m+k}(x^{-1})$,        $m = 0, 1, 2, \ldots$

are monic, form an orthogonal set with respect to the sequence $p_{k+1}, p_{k+2}, p_{k+3}, \ldots$ and have norms $p_{m,m+k}$.

Since the sequence $V_{m,m+k}$ forms a diagonal, the $V_m(x)$ will be called diagonal polynomials.

If the series $D(x) = \sum p_\mu x^\mu$ has regular diagonals $D_k$ and $D_{k+1}$, then from Theorem 4.1 it follows that $D^*(x) = \sum p_\mu x^{2\mu}$ has a regular diagonal $D^*_{2k+1}$, and from Theorem 6.1 and the relations (4.1) and (4.2) we then have the following:

THEOREM 6.2. Let the series $D(x) = \sum p_\mu x^\mu$ have regular diagonals $D_k$ and $D_{k+1}$ consisting of approximants $(U_{m,n}(x), V_{m,n}(x))$; then the polynomials $W_m(x)$ defined by

(6.3)   $W_{2m}(x) = x^{2m} V_{m,m+k}(x^{-2})$,   $W_{2m+1}(x) = x^{2m+1} V_{m,m+k+1}(x^{-2})$

are monic, form an orthogonal set with respect to the sequence $p_{k+1}, 0, p_{k+2}, 0, p_{k+3}, 0, \ldots$, and have norms $\Omega(W_{2m}^2) = p_{m,m+k}$, $\Omega(W_{2m+1}^2) = p_{m,m+k+1}$.

These polynomials will be called stepline polynomials (since the sequence $V_{0,k}, V_{0,k+1}, V_{1,k+1}, V_{1,k+2}, \ldots$, forms a stepline in the Padé-table).

Combination of Theorems 6.1 and 5.3 gives

THEOREM 6.3. Under the conditions of Theorem 6.2 the assertions of Theorem 6.1 hold, and the polynomials $V_m(x)$ satisfy the recurrence relation:

(6.4)   $V_{m+1}(x) = (x - a_m) V_m(x) - b_m V_{m-1}(x)$,        $m \ge 1$,

where $a_m$ and $b_m$ are given by (5.5) and (5.6).

Combination of Theorems 6.2, 5.1, and 5.2 gives

THEOREM 6.4. Under the conditions of Theorem 6.2 the assertions of Theorem 6.2 hold, and the polynomials $W_m(x)$ satisfy the recurrence relations

(6.5)   $W_{2m+1} = x W_{2m} - p_{m,m+k}\, p_{m-1,m+k}^{-1}\, W_{2m-1}$,        $m \ge 1$,
(6.6)   $W_{2m+2} = x W_{2m+1} - p_{m,m+k+1}\, p_{m,m+k}^{-1}\, W_{2m}$,        $m \ge 0$.


7. Extension to rings. Hitherto it has been assumed that we are working in a commutative field. It may now be pointed out that a similar theory exists if $R$ is a not necessarily commutative ring with unit element. This is of importance when we consider matrix polynomials, that is, polynomials having matrix-coefficients. But in this case the orthogonality according to (1.3) is not complete since (1.4) is only true for $n < m$. However, if we take the matrices $r_\mu$ hermitian and for any two polynomials $P(x) = \sum p_\mu x^\mu$, $Q(x) = \sum q_\nu x^\nu$ define an "inner product" $\{P, Q\} = \sum p_\mu r_{\mu+\nu} q_\nu^*$, then for any $m \neq n$ we have $\{Q_m, Q_n\} = 0$ instead of (1.4). This is quite natural in connection with complex valued orthogonal functions, where $f(x)$ and $g(x)$ are called orthogonal with respect to a real non-negative distribution $d\psi(x)$ if $\int f(x)\, d\psi(x)\, \overline{g(x)} = 0$.

Returning to the general case where $R$ is an arbitrary ring, we see that all notions in §§1-6 have a meaning. To render this true, we accept such rules as: the set of equations (1.1) has exactly one solution with $q_m = 1$. However it is no longer true that to each place there corresponds an approximant. If we call a quantity $b$ a right zero divisor if there is a $c \neq 0$ such that $cb = 0$, then the norms of a set of orthogonal polynomials are not even right zero divisors. In the definition of regularity (§3) a constant left-multiple is required. Theorem 3.1 remains true; in Theorem 3.2 the $p_{m,m+k}$ are not right zero divisors, and Theorem 3.3 remains true under the additional condition that the $p_{m,m+k}$ should not be right zero divisors. The final remark of §3 is no longer true. It may be remarked that the conditions of Theorem 3.3, with $p_{m,m+k}$ not right zero divisors, are much weaker than those in the corresponding Theorem 6.3 of the author's thesis.

All the other theorems in Part I remain true, provided only that such expressions as $p_{m,m+k}\, p_{m,m+k-1}^{-1}$, if present, exist; here $ab^{-1}$ will be said to exist and be equal to $c$ if $b$ is not a right zero divisor and $a = cb$.


PART II 


8. General hypergeometric series. Let us compare the definition of the ordinary hypergeometric series

(8.1)   $F(a, b; c; x) = \sum_{\mu=0}^{\infty} \frac{a(a+1)\ldots(a+\mu-1)\; b(b+1)\ldots(b+\mu-1)}{c(c+1)\ldots(c+\mu-1)\; 1\cdot 2 \ldots \mu}\, x^\mu$

with that of the Heine series

(8.2)   $H(a, b; c; x) = \sum_{\mu=0}^{\infty} \frac{(1-a)(1-aq)\ldots(1-aq^{\mu-1})\; (1-b)(1-bq)\ldots(1-bq^{\mu-1})}{(1-c)(1-cq)\ldots(1-cq^{\mu-1})\; (1-q)(1-q^2)\ldots(1-q^\mu)}\, x^\mu$,

where $q$ is not a root of unity.

We remark that in both cases the coefficient of $x^n$ can be represented as

(8.3)   $\frac{[a,0][a,1]\ldots[a,n-1]\; [b,0][b,1]\ldots[b,n-1]}{[c,0][c,1]\ldots[c,n-1]\; [\sigma,1][\sigma,2]\ldots[\sigma,n]}$

where

(8.4)   $[s, k] = s + k$,   $\sigma = 0$,

or

(8.5)   $[s, k] = 1 - sq^k$,   $\sigma = 1$.

In both cases we have

(8.6)   $[\sigma, 0] = 0$,

(8.7)   $[\sigma, h] \neq 0$ for all $h \neq 0$

and

(8.8)   $[s, k + h] = g_1(h)[s, k] + g_2(h)$

for all $h$, $k$ and $s$, where $g_1(h)$ and $g_2(h)$ are functions of $h$ only.

To obtain a unified treatment of hypergeometric and Heine series we shall assume that $[s, k]$ is any function with values in a commutative field $R$ which is defined for all elements $s$ of a set $S$ (that need not consist of elements of $R$) and all integers $k$. Assume also that $[s, k]$ satisfies (8.6) and (8.7) for a certain element $\sigma \in S$, and satisfies (8.8) for all $s \in S$ and all integers $h$ and $k$.











By (8.6) and (8.7), the condition (8.8) is equivalent to

(8.9)   $\begin{vmatrix} 1 & [a, k] & [a, k+h] \\ 1 & [b, l] & [b, l+h] \\ 1 & [c, m] & [c, m+h] \end{vmatrix} = 0$   for all $a, b, c \in S$ and all $h, k, l, m$.

In fact, if (8.8) holds then the columns in (8.9) are linearly dependent.

Conversely, if in (8.9) we substitute $a = s$, $b = c = \sigma$, $m = 0$, $l = -h \neq 0$ we obtain

(8.10)   $[s, k + h] = -[\sigma, h][\sigma, -h]^{-1}[s, k] + [\sigma, h]$,

which clearly has the form (8.8). It follows that for $h \neq 0$ we have

$g_1(h) = -[\sigma, h][\sigma, -h]^{-1}$,   $g_2(h) = [\sigma, h]$,

whereas $g_1(0) = 1$, $g_2(0) = 0$.
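Both model cases (8.4) and (8.5) can be checked against the determinant condition (8.9) and the shift rule (8.10) with exact arithmetic. The sketch below does this for a few sample arguments; the base `Q` and all function names are assumptions introduced for illustration.

```python
from fractions import Fraction

Q = Fraction(1, 2)  # hypothetical base for the Heine case; not a root of unity

def ordinary(s, k):
    """[s, k] = s + k, with sigma = 0 (formula (8.4))."""
    return Fraction(s) + k

def heine(s, k):
    """[s, k] = 1 - s q^k, with sigma = 1 (formula (8.5))."""
    return 1 - Fraction(s) * Q ** k

def det3(rows):
    """Determinant of a 3 x 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def determinant_8_9(bracket, a, b, c, h, k, l, m):
    """Left-hand side of (8.9) for the given bracket function."""
    return det3([[1, bracket(a, k), bracket(a, k + h)],
                 [1, bracket(b, l), bracket(b, l + h)],
                 [1, bracket(c, m), bracket(c, m + h)]])

def shift_8_10(bracket, sigma, s, k, h):
    """Right-hand side of (8.10); only meaningful for h != 0."""
    return (-bracket(sigma, h) / bracket(sigma, -h) * bracket(s, k)
            + bracket(sigma, h))

for bracket, sigma in ((ordinary, 0), (heine, 1)):
    print(determinant_8_9(bracket, 2, 3, 5, 4, -1, 2, 0),
          shift_8_10(bracket, sigma, 7, 3, -2) == bracket(7, 1))
```

A zero determinant together with agreement of the shifted value is what (8.9) and (8.10) predict; a finite sample of arguments is of course no substitute for the proof.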
To simplify our notation, we define

(8.11)   $[s, k]_0 = 1$

for all $k$ and $s$,

(8.12)   $[s, k]_h = [s, k][s, k + 1] \ldots [s, k + h - 1]$

for all $h > 0$, $k$ and $s$,

(8.13)   $[s, k]_{-h} = [s, k - 1]^{-1} \ldots [s, k - h]^{-1}$

for all $h > 0$, $k$ and $s$ for which the right-hand member is defined.

Then we have

(8.14)   $[s, k]_h = [s, k]_m\, [s, k + m]_{h-m}$

for all $h$, $k$, $m$ and $s$ for which the right-hand member is defined.

The series

(8.15)   $F([a, k], [b, l]; [c, m]; x) = \sum_{\mu=0}^{\infty} \frac{[a, k]_\mu\, [b, l]_\mu}{[c, m]_\mu\, [\sigma, 1]_\mu}\, x^\mu$,

which is defined if $[c, m + \mu] \neq 0$ for all $\mu \ge 0$, will be called the general hypergeometric series.

If $[s, k]$ and $\sigma$ are given by (8.4) or (8.5), the general hypergeometric series is an ordinary hypergeometric or Heine series, respectively. Since $[c, m + \mu] \neq 0$ for all $\mu \ge 0$, we have the formal identity

(8.16)   $F([a, k], [b, l]; [c, m]; x) - F([a, k + 1], [b, l]; [c, m + 1]; x) + [c, m]_2^{-1}\, [b, l]\, \{[c, m] - [a, k]\}\, x\, F([a, k + 1], [b, l + 1]; [c, m + 2]; x) = 0$.

For the constant terms cancel, whereas the coefficient of $x^\mu$, $\mu > 0$, is equal to

$[c, m]_{\mu+1}^{-1}\, [\sigma, 1]_\mu^{-1}\, [a, k + 1]_{\mu-1}\, [b, l]_\mu\, \{[a, k][c, m + \mu] - [a, k + \mu][c, m] + ([c, m] - [a, k])[\sigma, \mu]\}$,

which is zero by virtue of (8.10).
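The identity (8.16) can be tested coefficient by coefficient. The following sketch does so for the ordinary case $[s, k] = s + k$ with $k = l = m = 0$, where (8.16) relates $F(a, b; c; x)$, $F(a+1, b; c+1; x)$ and $x F(a+1, b+1; c+2; x)$; the parameter values and function names are assumptions chosen for the example.

```python
from fractions import Fraction

def rising(start, count):
    """start (start + 1) ... (start + count - 1); the empty product is 1."""
    result = Fraction(1)
    for i in range(count):
        result *= start + i
    return result

def series(a, b, c, terms):
    """First `terms` coefficients of the ordinary series F(a, b; c; x)."""
    return [rising(a, mu) * rising(b, mu) / (rising(c, mu) * rising(1, mu))
            for mu in range(terms)]

def residual_8_16(a, b, c, terms):
    """Coefficients of the left-hand side of (8.16) in the ordinary case."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    f0 = series(a, b, c, terms)
    f1 = series(a + 1, b, c + 1, terms)
    f2 = series(a + 1, b + 1, c + 2, terms)
    factor = b * (c - a) / (c * (c + 1))   # [c, m]_2^{-1} [b, l] {[c, m] - [a, k]}
    # Multiplication by x shifts the coefficient list by one place.
    shifted = [Fraction(0)] + [factor * t for t in f2[:terms - 1]]
    return [u - v + w for u, v, w in zip(f0, f1, shifted)]

print(residual_8_16(Fraction(1, 3), Fraction(5, 2), Fraction(7, 4), 6))
```

All printed coefficients vanish, in agreement with the cancellation described in the text; the Heine case could be checked the same way after replacing the rising products by products of $1 - sq^k$.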














In a similar way it can be verified that

(8.17)   $F([a, k + 1], [b, l]; [c, m]; x) - F([a, k], [b, l + 1]; [c, m]; x) + [c, m]^{-1}\, \{[a, k] - [b, l]\}\, x\, F([a, k + 1], [b, l + 1]; [c, m + 1]; x) = 0$

if $[c, m + \mu] \neq 0$ for all $\mu \ge 0$.


9. The functions $[s, k]$. In this section we investigate the functions $[s, k]$ satisfying equations (8.6) to (8.8).

If we put

(9.1)   $-[\sigma, 1][\sigma, -1]^{-1} = q$

then from (8.10) for $h = 1$, $s = \sigma$, it follows that

(9.2)   $[\sigma, k + 1] = q[\sigma, k] + [\sigma, 1]$.

Hence by induction

(9.3)   $[\sigma, k + h] = q^h[\sigma, k] + (q^{h-1} + q^{h-2} + \ldots + q + 1)[\sigma, 1]$   for $h > 0$.

Putting $k = 0$ in (9.3) we obtain

(9.4)   $[\sigma, h] = (q^{h-1} + q^{h-2} + \ldots + q + 1)[\sigma, 1]$   for $h > 0$;

putting $k = -h$ in (9.3) we obtain

(9.5)   $[\sigma, -h] = -q^{-h}(q^{h-1} + q^{h-2} + \ldots + q + 1)[\sigma, 1]$   for $h > 0$.

Conversely, the relations (9.4) and (9.5) define the function $[\sigma, h]$ if $[\sigma, 1]$ and $q$ are given. These elements can be chosen quite arbitrarily. We formulate this as a theorem:

THEOREM 9.1. If $[\sigma, 1]$ and $q \neq 0$ are given elements, then the function $[\sigma, h]$ defined by (9.4) and (9.5) satisfies (8.6) and (8.8) for $s = \sigma$, and all $h$, $k$. In order that this function satisfy (8.7) also, $[\sigma, 1]$ should be chosen $\neq 0$ and $q$ should not be a root of unity different from 1.

Another consequence of (9.4) and (9.5) is that

(9.6)   $[\sigma, -h] = -q^{-h}[\sigma, h]$   for all $h$.

Substituting this result in (8.10) we obtain

(9.7)   $[s, k + h] = q^h[s, k] + [\sigma, h]$.

Hence

(9.8)   $[a, k + h] - [c, l + h] = q^h\{[a, k] - [c, l]\}$.

From (9.7) we have the special case

(9.9)   $[s, h] = q^h[s, 0] + [\sigma, h]$,

hence the function $[s, h]$ is determined as soon as the functions $[s, 0]$ and $[\sigma, h]$ are known ($q$ being defined by (9.1)). This is formulated in the following theorem, which is readily verified:










THEOREM 9.2. If $[\sigma, h]$ is any function satisfying conditions (8.6) to (8.8) for $s = \sigma$, and all $h$, $k$, and $[s, 0]$ is any function defined for all $s$ in some set $S$ (of course $[\sigma, 0] = 0$), then the function $[s, h]$ defined by (9.9) satisfies (8.8) (and of course (8.6) and (8.7)) for all $s$, $h$, $k$.

This theorem enables us to extend the domain $S$ of a function $[s, k]$ satisfying equations (8.6) to (8.8): suppose we add an element $a$ to $S$ and take for $[a, 0]$ an arbitrary element of $R$. Then, if we put $[a, k] = q^k[a, 0] + [\sigma, k]$ in accordance with (9.9), it follows that the extended function $[s, k]$ still satisfies equations (8.6) to (8.8).

We shall apply this in particular by first extending the field $R$, mentioned in the definition of the function $[s, k]$ (cf. §8), to the field $R(y)$ of rational functions in the variable $y$ over $R$. Then we add the element $y$ to $S$ and define $[y, 0] = y$, hence

(9.10)   $[y, k] = q^k y + [\sigma, k]$

(cf. §10). From (9.7) with $(c, m, k)$ instead of $(s, k, h)$ it follows that, if in the right-hand member of (9.10) we substitute $[c, m]$ for $y = [y, 0]$, then $[y, k]$ is replaced by $[c, k + m]$. Hence, if in $[y, 0]_\mu$, considered as a polynomial in $y$, we substitute $[c, m]$ for $y$, this expression is transformed into $[c, m]_\mu$.

In later sections we will add $y$ to $S$, but take $[y, 0] = y^{-1}$, hence

(9.11)   $[y, k] = q^k y^{-1} + [\sigma, k]$.

From Theorems 9.1 and 9.2 it follows that the class of all possible functions $[s, k]$ is very limited. Actually (9.4) and (9.5) simplify to

(9.12)   $[\sigma, k] = [\sigma, 1] \cdot k$   if $q = 1$,

(9.13)   $[\sigma, k] = [\sigma, 1]\, \frac{1 - q^k}{1 - q}$   if $q \neq 1$.

Hence:

THEOREM 9.3. The function $[s, k]$ satisfies

(9.14)   $[s, k] = [s, 0] + [\sigma, 1] \cdot k$   if $q = 1$,

or

(9.15)   $[s, k] = q^k[s, 0] + [\sigma, 1]\, \frac{1 - q^k}{1 - q}$   if $q \neq 1$

(in the latter case $q$ cannot be a root of unity).

From this theorem it follows that if $q = 1$, the series (8.15) always is an ordinary hypergeometric series, whereas if $q \neq 1$ the series is a Heine series. However, as we wish to treat the hypergeometric and Heine series simultaneously, we shall not make use of this theorem in the present part. It has, however, some importance in connection with Part III.
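Theorem 9.3 says that a bracket function is fixed by the element $[\sigma, 1]$, the element $q$, and the values $[s, 0]$. The sketch below builds $[s, k]$ from such data according to (9.14) and (9.15) and checks the relations (9.6) to (9.8) on a range of integers; the function `make_bracket`, the dictionary of base values and the sample parameters are assumptions made for illustration.

```python
from fractions import Fraction

def make_bracket(sigma_one, q, base):
    """Build [s, k] from [sigma, 1], q and a table base[s] = [s, 0].

    Follows (9.14) for q = 1 and (9.15) otherwise; base["sigma"] must be 0.
    """
    sigma_one, q = Fraction(sigma_one), Fraction(q)

    def sigma(k):
        # (9.12) for q = 1, (9.13) for q != 1
        if q == 1:
            return sigma_one * k
        return sigma_one * (1 - q ** k) / (1 - q)

    def bracket(s, k):
        return q ** k * Fraction(base[s]) + sigma(k)

    return bracket

def check_relations(bracket, q, span=3):
    """Check (9.6), (9.7) and (9.8) for all k, h in [-span, span]."""
    q = Fraction(q)
    for k in range(-span, span + 1):
        for h in range(-span, span + 1):
            ok_96 = bracket("sigma", -h) == -q ** (-h) * bracket("sigma", h)
            ok_97 = bracket("a", k + h) == q ** h * bracket("a", k) + bracket("sigma", h)
            ok_98 = (bracket("a", k + h) - bracket("c", 1 + h)
                     == q ** h * (bracket("a", k) - bracket("c", 1)))
            if not (ok_96 and ok_97 and ok_98):
                return False
    return True

base = {"sigma": 0, "a": Fraction(7, 3), "c": -2}
print(check_relations(make_bracket(1, 1, base), 1))
print(check_relations(make_bracket(3, Fraction(2, 5), base), Fraction(2, 5)))
```

With $[\sigma, 1] = 1 - q$ and $[s, 0] = 1 - s$ the construction reproduces the Heine bracket $1 - sq^k$ of (8.5), and with $q = 1$, $[\sigma, 1] = 1$, $[s, 0] = s$ it reproduces (8.4).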


10. Hypergeometric polynomials. Since $[\sigma, 0] = 0$ it follows that $[\sigma, -n]_\mu = 0$ for $\mu > n \ge 0$. Hence $F([\sigma, -n], [b, l]; [c, m]; x)$ is a polynomial of degree $\le n$. These polynomials will be called general hypergeometric polynomials.
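As a concrete instance, assume the ordinary case (8.4), so that $[s, k] = s + k$ and $\sigma = 0$, and take $n = 2$. The series then terminates after the $x^2$ term, because $[\sigma, -2]_\mu$ contains the factor $[\sigma, 0] = 0$ for every $\mu \ge 3$:

```latex
\begin{align*}
[\sigma,-2]_1 &= -2, \qquad
[\sigma,-2]_2 = (-2)(-1) = 2, \qquad
[\sigma,-2]_3 = (-2)(-1)(0) = 0, \\
F([\sigma,-2],[b,l];[c,m];x)
  &= 1 - \frac{2(b+l)}{c+m}\,x
     + \frac{(b+l)(b+l+1)}{(c+m)(c+m+1)}\,x^{2}.
\end{align*}
```

The coefficient of $x^2$ uses $[\sigma, 1]_2 = 1 \cdot 2$ in the denominator of (8.15), which cancels the factor $[\sigma, -2]_2 = 2$.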
In the following we shall consider expressions such as 


(10.1) [c, m], F([o, —n], (0, 1]; [c, m]; x), n> 0. 


If (c, m + yu] has not an inverse for all u > 0, we can still give a meaning 
to (10.1) by extending R to R(y), defining [y, k] by (9.10), and substituting 
[c, m] for y in 

ly, O], F([o, —m], (0, 2]; [y, 0]; x). 


This definition is consistent with the usual meaning of (10.1) if [c, m + yu] 
has an inverse for all u > 0. 
For polynomials of type (10.1) we have the following generalization of 

(8.17): 
(10.2) [c, m], F({o, —n + 1], [b, 2]; [c, m]; x) 

— [c, m], F([o, —n], [b, 2 + 1]; [c, m]; x) 

+ ([o, —n] — [b, 1)) [c, m + 1),-1F (lo, —m + 1), [6,2 + 1); fe, m + 1); x) 

= (), 


which holds for all 6 and c, m > 0, m and 1. 
Let S' be the subset of S consisting of those elements s, to each of which
there corresponds a uniquely determined element $s' \in S$ such that for each k


(10.3) $[s', -k] = f(s)\, q^{-k}\, [s, k]$,


where f(s) does not depend on k.
Then it is only a matter of calculation to show that, if $a, c \in S'$, $n \ge 0$
and the left-hand member exists,


(10.4) $x^n F([\sigma, -n], [a, k]; [c, l]; x^{-1}) = [a, k]_n [c, l]_n^{-1}\, q^{-\frac12 n(n+1)} (-1)^n$
$\qquad \times F([\sigma, -n], [c', -l-n+1]; [a', -k-n+1]; x f(a) f(c)^{-1} q^{\,l-k+n+1})$.

The set S' is not empty, since (10.3) holds for $s = s' = \sigma$, $f(\sigma) = -1$.


Actually, it is easy to see that under some circumstances S' may contain
more than one element of S, or may coincide with S.


11. Padé's theorem. In this section we shall formulate and prove a
generalization of a result of Padé (5). In the following, j denotes an arbitrary
but fixed integer which may be $-\infty$.


THEOREM 11.1. Let the series


(11.1) $F(x) = \sum_{\mu=j}^{\infty} [a, 0]_\mu\, [c, 0]_\mu^{-1}\, x^\mu$

be defined. If for $0 \le m \le n - j + 1$ we put

(11.2) $V_{m,n}(x) = (-1)^m\, [a, n-m+1]_m\, [c, n]_m^{-1}\, q^{\frac12 m(m-1)}\, x^m\, F([\sigma, -m], [c, n]; [a, n-m+1]; x^{-1}q)$,

then $V_{m,n}(x)$ is a Padé denominator of F(x) for the place (m, n). Moreover, if
we put

(11.3) $p_{m,n} = [\sigma, 1]_m\, [a, 0]_{n+1}\, \{[c, 0]_{m+n+1}\, [c, n]_m\}^{-1}\, q^{mn}\, \{[c, 0] - [a, 0]\} \cdots \{[c, m-1] - [a, 0]\}$

and

(11.4) $F_{m,n}(x) = F([a, n+1], [\sigma, m+1]; [c, m+n+1]; x)$,

then there exists a series $U_{m,n}(x)$ in which the highest exponent of x is $\le n$, such
that


(11.5$_{m,n}$) $V_{m,n}(x)\, F(x) = U_{m,n}(x) + p_{m,n}\, x^{m+n+1}\, F_{m,n}(x)$.

Proof. The existence of the series (11.1) implies the existence of $\{[c, n]_m\}^{-1}$
in (11.2), and then from §10 it follows that $V_{m,n}(x)$ is defined and represents
a polynomial of degree $\le m$. It also follows from the existence of (11.1) that
$p_{m,n}$ and $F_{m,n}$ are defined. Then (11.5$_{m,n}$) implies that $V_{m,n}(x)$ is a Padé
denominator of F(x) for the place (m, n). The proof of (11.5$_{m,n}$) will be
performed by induction.

Firstly, (11.5$_{0,n}$) is true for all $n \ge j - 1$.

From (10.2) with $(c, a, m, n-1, n-m+1, x^{-1}q)$ instead of $(b, c, n, l, m, x)$
it follows after some calculations that

(11.6$_{m,n}$) $V_{m,n}(x) = V_{m-1,n}(x) - q_{m,n}\, x\, V_{m-1,n-1}(x)$,

where $q_{m,n} = [c, m+n-2]_2^{-1}\, [a, n]\, [c, n-1]\, q^{m-1}$, and these $q_{m,n}$ satisfy

(11.7$_{m,n}$) $p_{m-1,n-1}\, q_{m,n} = p_{m-1,n}$.

From (8.16) it follows after some calculations that

(11.8$_{m,n}$) $F_{m-1,n-1}(x) = F_{m-1,n}(x) - r_{m,n}\, x\, F_{m,n}(x)$,

where, by (9.8),

$r_{m,n} = [c, m+n-1]_2^{-1}\, [\sigma, m]\, \{[c, m-1] - [a, 0]\}\, q^{n}$,

and these $r_{m,n}$ satisfy

(11.9$_{m,n}$) $p_{m-1,n}\, r_{m,n} = p_{m,n}$.

Now suppose that for certain integers m and n the relations (11.5$_{m-1,n-1}$)
and (11.5$_{m-1,n}$) have already been proved. Then using successively (11.6$_{m,n}$),
(11.5$_{m-1,n}$), (11.5$_{m-1,n-1}$), (11.7$_{m,n}$), (11.8$_{m,n}$) and (11.9$_{m,n}$), and putting
$U_{m,n}(x) = U_{m-1,n}(x) - q_{m,n}\, x\, U_{m-1,n-1}(x)$ we obtain

$V_{m,n}(x) F(x) = V_{m-1,n}(x) F(x) - q_{m,n}\, x\, V_{m-1,n-1}(x) F(x)$
$\quad = U_{m-1,n}(x) + p_{m-1,n}\, x^{m+n} F_{m-1,n}(x) - q_{m,n}\, x\, U_{m-1,n-1}(x) - q_{m,n}\, p_{m-1,n-1}\, x^{m+n} F_{m-1,n-1}(x)$
$\quad = U_{m-1,n}(x) - q_{m,n}\, x\, U_{m-1,n-1}(x) + p_{m-1,n}\, x^{m+n} (F_{m-1,n}(x) - F_{m-1,n-1}(x))$
$\quad = U_{m,n}(x) + p_{m-1,n}\, r_{m,n}\, x^{m+n+1} F_{m,n}(x)$
$\quad = U_{m,n}(x) + p_{m,n}\, x^{m+n+1} F_{m,n}(x)$.





Thus (11.5$_{m,n}$) follows from (11.5$_{m-1,n-1}$) and (11.5$_{m-1,n}$). Since (11.5$_{0,n}$) is
satisfied for all $n \ge j - 1$, it follows that (11.5$_{m,n}$) is true for $0 \le m \le n - j + 1$.


Remarks. I. From (11.2) it follows that $V_{m,n}(0) = 1$.
II. Padé's original theorem (cf. (5)) gives denominators for a certain class
of hypergeometric series. This corresponds to the special case j = 0 of our
theorem: we then have $F(x) = F([a, 0], [\sigma, 1]; [c, 0]; x)$. For every finite j
we have


(11.10) $F(x) = [a, 0]_j\, [c, 0]_j^{-1}\, x^j\, F([a, j], [\sigma, 1]; [c, j]; x)$.
III. If $a, c \in S'$ (cf. §10), then from (11.2) and (10.4) it follows that
(11.11) $V_{m,n}(x) = F([\sigma, -m], [a', -n]; [c', -m-n+1]; x\, q\, f(c) f(a)^{-1})$.


In particular we see that the ordinary hypergeometric polynomial
$F(-m, -a-n; -c-m-n+1; x)$ is a Padé denominator of the ordinary
hypergeometric series $F(a, 1; c; x)$ for the place (m, n) if $m \le n + 1$. This is Padé's
original theorem. We also see that the Heine polynomial (for notation cf.
(8.2)) $H(q^{-m}, a^{-1}q^{-n}; c^{-1}q^{-m-n+1}; x a c^{-1} q)$ is a Padé denominator of the Heine
series $H(a, q; c; x)$ for the place (m, n) if $m \le n + 1$.
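
The ordinary hypergeometric case of Remark III can be checked with exact rational arithmetic. The sketch below is only an illustration, not part of the original argument: the function names are hypothetical, and the parameter values in the usage line are arbitrary non-integers chosen so that no denominator vanishes. It multiplies the polynomial $F(-m, -a-n; -c-m-n+1; x)$ by the series $F(a, 1; c; x)$ and returns the coefficients of $x^{n+1}, \ldots, x^{n+m}$, which should all be zero for a Padé denominator at the place (m, n).

```python
from fractions import Fraction

def poch(a, n):
    """Rising factorial a(a+1)...(a+n-1)."""
    out = Fraction(1)
    for i in range(n):
        out *= a + i
    return out

def series_coeffs(a, c, terms):
    """Coefficients of F(a, 1; c; x) = sum (a)_mu / (c)_mu x^mu."""
    return [poch(a, mu) / poch(c, mu) for mu in range(terms)]

def denominator_coeffs(a, c, m, n):
    """Coefficients of F(-m, -a-n; -c-m-n+1; x), a polynomial of degree <= m."""
    A, B, C = Fraction(-m), -a - n, -c - m - n + 1
    return [poch(A, mu) * poch(B, mu) / (poch(C, mu) * poch(Fraction(1), mu))
            for mu in range(m + 1)]

def pade_gap(a, c, m, n):
    """Coefficients of x^(n+1), ..., x^(n+m) in the product V(x) F(x)."""
    V = denominator_coeffs(a, c, m, n)
    F = series_coeffs(a, c, m + n + 1)
    return [sum(V[i] * F[p - i] for i in range(min(m, p) + 1))
            for p in range(n + 1, n + m + 1)]

print(pade_gap(Fraction(1, 3), Fraction(5, 7), 2, 3))
```

Exact fractions are used instead of floats so that vanishing coefficients come out as exact zeros rather than rounding noise.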


12. Confluent series. Consider the rational function field R(y), and, as in
(9.11), let $[y, 0] = y^{-1}$ and $[y, k] = q^k y^{-1} + [\sigma, k]$. This expression has an
inverse in R(y) for every k. Then the coefficient of $x^\mu$ in the series
$F([a, k], [b, l]; [y, 0]; x y^{-1})$ is


$[a, k]_\mu\, [b, l]_\mu\, \{(q + [\sigma, 1]y) \cdots (q^{\mu-1} + [\sigma, \mu-1]y)\, [\sigma, 1]_\mu\}^{-1}$.


The substitution y = 0 in this expression gives a meaningful result. By this
substitution we obtain from the given series the series


${}_2F_0([a, k], [b, l]; x) = \sum_{\mu=0}^{\infty} \dfrac{[a, k]_\mu\, [b, l]_\mu}{[\sigma, 1]_\mu}\, q^{-\frac12 \mu(\mu-1)}\, x^\mu$.

In a similar way we obtain by putting y = 0 in $F([a, k], [y, 0]; [c, m]; xy)$
the series

${}_1F_1([a, k]; [c, m]; x) = \sum_{\mu=0}^{\infty} \dfrac{[a, k]_\mu}{[c, m]_\mu\, [\sigma, 1]_\mu}\, q^{\frac12 \mu(\mu-1)}\, x^\mu$.

The series ${}_2F_0$ and ${}_1F_1$ will be called confluent series of the first kind. It is also
easily verified that y = 0 in $F([a, k], [b, l]; [y, m]; x y^{-1})$ gives ${}_2F_0([a, k], [b, l]; x q^{-m})$,
and that y = 0 in $F([a, k], [y, l]; [c, m]; xy)$ gives ${}_1F_1([a, k]; [c, m]; x q^{l})$.

If $q \ne 1$ then for a special value of one of the parameters in
$F([a, k], [b, l]; [c, m]; x)$ we get series which are intimately connected with these
confluent series of the first kind. In fact let $[\zeta, 0] = [\sigma, 1](1 - q)^{-1}$. Then
$[\zeta, \mu] = [\zeta, 0]$ for all $\mu$; hence

$F([a, k], [b, l]; [\zeta, m]; x[\zeta, 0]) = \sum_{\mu=0}^{\infty} \dfrac{[a, k]_\mu\, [b, l]_\mu}{[\sigma, 1]_\mu}\, x^\mu$.

This leads us to the definition of the confluent series of the second kind: 


2Fo((a, k], (6, 4]; x) = F({a, k], (5, 2); [¢, 0); x{¢, 0), 

iFi(a, k]; [c, m]; x) = F((a, k], [¢, 0]; [c, m]; x[¢, 07"), 

iFo({a, k]; x) = iF i({a, k]; (¢, 0]; x[t, O}) = 2Fo([a, k), [y, 0]; xy) yo, 

Fo’ ({a, kl; x) = 2Fo({a, R], (¢, 0]; x{¢, 0}-?) = sFi({a, k); [y, 0]; xy) yao. 


The initial assumption $q \ne 1$ can be avoided by considering q as a variable
over R, in which case the defining expressions above for the four types of
confluent series of the second kind certainly have a sense, and allow
substitution of any value for q which satisfies the conditions in Theorem 9.1. If
q = 1 we again obtain the former confluent series.
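
The special parameter used for the second kind has a bracket that does not depend on its index. A small Python check of this, using (9.15) with exact fractions, is sketched below; the names `bracket_q` and `zeta_constant` and the sample values of q are assumptions made for the illustration.

```python
from fractions import Fraction

def bracket_q(s0, sigma1, q, k):
    """[s, k] from (9.15), valid for q != 1; s0 stands for [s, 0]."""
    s0, sigma1, q = Fraction(s0), Fraction(sigma1), Fraction(q)
    return q**k * s0 + sigma1 * (q**k - 1) / (q - 1)

def zeta_constant(sigma1, q):
    """The special value [sigma, 1] (1 - q)^(-1) taken as [s, 0]."""
    return Fraction(sigma1) / (1 - Fraction(q))

q = Fraction(1, 3)
z0 = zeta_constant(1, q)
# Every index gives back the same value z0.
print([bracket_q(z0, 1, q, mu) for mu in range(-2, 3)])
```

Because the bracket is constant, the shifted products of this parameter are pure powers, which is what allows the factor to be absorbed into the variable x.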

The close relationship between the series of different kinds arises from 
(10.4). In fact, if we apply confluence of any kind to (10.4), then on the left- 
and right-hand sides confluent series of opposite kinds appear. 

From Theorem 11.1 it is easy to deduce corresponding theorems for the
confluent series. If we replace x by $x y^{-1}$, take c = y, where $[y, \mu] = q^\mu y^{-1}
+ [\sigma, \mu]$, and put y = 0, we obtain the following:


THEOREM 12.1. Let the series F'(x) be defined by


$F'(x) = \sum_{\mu=j}^{\infty} [a, 0]_\mu\, q^{-\frac12 \mu(\mu-1)}\, x^\mu$

for some integer j. If for $0 \le m \le n - j + 1$ we put

$V'_{m,n}(x) = (-1)^m\, [a, n-m+1]_m\, q^{-mn}\, x^m\, {}_1F_1([\sigma, -m]; [a, n-m+1]; x^{-1} q^{n+1})$,
$p'_{m,n} = [\sigma, 1]_m\, [a, 0]_{n+1}\, q^{-\frac12 (m+n)(m+n+1)}$,
$F'_{m,n}(x) = {}_2F_0([a, n+1], [\sigma, m+1]; x q^{-m-n-1})$,

then there exists a series $U'_{m,n}(x)$ of degree $\le n$ such that

$V'_{m,n}(x)\, F'(x) = U'_{m,n}(x) + p'_{m,n}\, x^{m+n+1}\, F'_{m,n}(x)$.


Similar results can be obtained for all the other cases.


13. Generalized classical orthogonal polynomials. I. If the general
hypergeometric series has regular diagonals, then from Theorem 11.1 and
Theorems 6.1 and 6.2 we can derive explicit forms for the corresponding
orthogonal polynomials.

Now, for all m, n such that $0 \le m \le n - j + 1$, the $V_{m,n}$ given by (11.2)
have $V_{m,n}(0) = 1$, and hence, by virtue of Theorem 3.3 those $V_{m,m+k}$ will
constitute a regular diagonal if the corresponding $p_{m,m+k}$ are different from
zero; by virtue of Theorem 3.2 this condition is also necessary, and then from
(11.3) it follows that $[a, \mu]$ and $\{[c, \mu] - [a, 0]\}$ should be different from zero
for all $\mu \ge 0$. However, this condition is independent of k, and hence all
diagonals of order k, $k \ge j - 1$, are regular at the same time. Hence Theorems
6.3 and 6.4 are applicable if this condition is satisfied. From Theorem 6.3 we
obtain










THEOREM 13.1. Let k be an arbitrary integer; let $[a, \mu]$, $[c, \mu]$ and
$\{[c, \mu] - [a, 0]\}$ be different from zero for all $\mu \ge 0$ and let the sequence
$[a, 0]_{k+1}[c, 0]_{k+1}^{-1}$, $[a, 0]_{k+2}[c, 0]_{k+2}^{-1}$, ..., be defined. Then the polynomials


$V_m(x) = (-1)^m\, [a, k+1]_m\, [c, m+k]_m^{-1}\, q^{\frac12 m(m-1)}\, F([\sigma, -m], [c, m+k]; [a, k+1]; xq)$

are monic, orthogonal with respect to the sequence and have norms


$p_{m,m+k} = [\sigma, 1]_m\, [a, 0]_{m+k+1}\, \{[c, 0]_{2m+k+1}\, [c, m+k]_m\}^{-1}\, \{[c, 0] - [a, 0]\} \cdots \{[c, m-1] - [a, 0]\}\, q^{m(m+k)}$.


Moreover we have the recurrence relation

$V_{m+1}(x) = (x - a_m)\, V_m(x) - b_m\, V_{m-1}(x)$

where

$a_m = \dfrac{[\sigma, m]\, \{[c, m-1] - [a, 0]\}}{[c, 2m+k-1]_2}\, q^{m+k} + \dfrac{[a, m+k+1]\, [c, m+k]}{[c, 2m+k]_2}\, q^{m}$

and

$b_m = \dfrac{[\sigma, m]\, [a, m+k]\, [c, m+k-1]\, \{[c, m-1] - [a, 0]\}}{[c, 2m+k-2]_2\, [c, 2m+k-1]_2}\, q^{2m+k-1}$.

If $[s, k]$ is given by (8.4), and we take k = -1, we obtain for $V_m(x)$


$(-1)^m\, \dfrac{a(a+1) \cdots (a+m-1)}{(c+m-1) \cdots (c+2m-2)}\, F(-m, c+m-1; a; x)$,

which is easily seen to be a Jacobi polynomial (9), though in a different
notation. For other values of k we get the same set of polynomials.

Hence, if $[s, k]$ is given by (8.5), we get Heine-analogues of the Jacobi
polynomials.
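
The Jacobi case k = -1 lends itself to an exact check. The sketch below assumes that (8.4) denotes the ordinary bracket $[s, k] = s + k$, so that the moment sequence is $(a)_\mu / (c)_\mu$ and $V_m(x) = (-1)^m (a)_m (c+m-1)_m^{-1} F(-m, c+m-1; a; x)$; the function names and the parameter values are hypothetical. It applies the moment functional to $x^i V_m(x)$ and confirms that the result vanishes for $i < m$.

```python
from fractions import Fraction

def poch(a, n):
    """Rising factorial a(a+1)...(a+n-1)."""
    out = Fraction(1)
    for i in range(n):
        out *= a + i
    return out

def V_coeffs(a, c, m):
    """Coefficients (lowest power first) of V_m(x) in the assumed ordinary case."""
    scale = (-1)**m * poch(a, m) / poch(c + m - 1, m)
    return [scale * poch(Fraction(-m), mu) * poch(c + m - 1, mu)
            / (poch(a, mu) * poch(Fraction(1), mu)) for mu in range(m + 1)]

def moment(a, c, mu):
    """Moment sequence (a)_mu / (c)_mu for k = -1."""
    return poch(a, mu) / poch(c, mu)

def pairing(a, c, m, i):
    """Moment functional applied to x^i V_m(x)."""
    return sum(coef * moment(a, c, mu + i)
               for mu, coef in enumerate(V_coeffs(a, c, m)))

a, c = Fraction(1, 3), Fraction(5, 7)
print([pairing(a, c, 3, i) for i in range(3)])  # orthogonality to lower powers
print(V_coeffs(a, c, 3)[-1])                    # leading coefficient
```

The check is purely algebraic, so the chosen parameters need not correspond to an integrable weight.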

There is a companion theorem to Theorem 13.1 corresponding to Theorem
6.4. This leads to the stepline polynomials


$W_{2m}(x) = (-1)^m\, [a, k+1]_m\, [c, m+k]_m^{-1}\, q^{\frac12 m(m-1)}\, F([\sigma, -m], [c, m+k]; [a, k+1]; x^2 q)$,

$W_{2m+1}(x) = (-1)^m\, [a, k+2]_m\, [c, m+k+1]_m^{-1}\, q^{\frac12 m(m-1)}\, x\, F([\sigma, -m], [c, m+k+1]; [a, k+2]; x^2 q)$,

orthogonal with respect to $[a, 0]_{k+1}[c, 0]_{k+1}^{-1}$, 0, $[a, 0]_{k+2}[c, 0]_{k+2}^{-1}$, 0, ...,
which in the case of ordinary hypergeometric series reduce to ultraspherical
polynomials. Recurrence relations and norms can be deduced from Theorems
6.4 and 11.1. The well-known relation between Jacobi and ultraspherical
polynomials actually originates in (6.2) and (6.3): if for k = -1 we denote
$V_m(x)$ by $J_m(a, c; x)$ and $W_m(x)$ by $P_m(a, c; x)$, then $P_{2m}(a, c; x) = J_m(a, c; x^2)$,
$P_{2m+1}(a, c; x) = x J_m(a+1, c+1; x^2)$ (cf., for example, 9, (4.1.5)). The same
is true for the relation between Laguerre and Hermite polynomials (cf. 9,
(5.6.1)).













II. In a similar way we obtain from Theorem 12.1 as diagonal polynomials

$V_m(x) = (-1)^m\, [a, k+1]_m\, q^{-m(m+k)}\, {}_1F_1([\sigma, -m]; [a, k+1]; x q^{m+k+1})$,

which are orthogonal with respect to

$[a, 0]_{k+1}\, q^{-\frac12 k(k+1)}$, $[a, 0]_{k+2}\, q^{-\frac12 (k+1)(k+2)}$, $[a, 0]_{k+3}\, q^{-\frac12 (k+2)(k+3)}$, ...,


if $[a, \mu] \ne 0$ for all $\mu \ge 0$ and if the moment sequence is defined. They are
generalized Laguerre polynomials, and the corresponding stepline polynomials
are generalized Hermite polynomials.


III. Applying Theorem 6.1 to the theorem resulting from Theorem 11.1 by
substitution of $\zeta$ for c, $x[\zeta, 0]$ for x (cf. §12), we obtain the polynomials
V(x) = (—1)"[a, k + Lng Fillo, —m); [a, k + 1]; xq), 


which are orthogonal with respect to $[a, 0]_{k+1}$, $[a, 0]_{k+2}$, $[a, 0]_{k+3}$, ..., if
$[a, \mu] \ne 0$ for $\mu \ge 0$, $a \ne \zeta$ and the moment-sequence is defined. They are
also generalized Laguerre polynomials, and the corresponding stepline
polynomials are likewise generalized Hermite polynomials.


IV. If in Theorem 11.1 we replace $[a, k]$ by $q^k y^{-1} + [\sigma, k]$, x by xy and put
y = 0 we obtain
Vin(x) = (—1)"[c, m + kh qe 2Fo([o, —m], [c, m + k]; xq“). 
These are orthogonal with respect to

$q^{\frac12 k(k+1)}\, [c, 0]_{k+1}^{-1}$, $q^{\frac12 (k+1)(k+2)}\, [c, 0]_{k+2}^{-1}$, $q^{\frac12 (k+2)(k+3)}\, [c, 0]_{k+3}^{-1}$, ...,


if the moment-sequence is defined. These polynomials are generalizations of
the Bessel polynomials, introduced by Krall and Frink (3).


*»* 


V. If in Theorem 11.1 we substitute $\zeta$ for a, $x[\zeta, 0]^{-1}$ for x, we obtain
Vn(x) = (—1)"[c, m + k]n'q?™.Fo({o, —m)], [c, m + k]; xq) 


orthogonal with respect to $[c, 0]_{k+1}^{-1}$, $[c, 0]_{k+2}^{-1}$, $[c, 0]_{k+3}^{-1}$, ..., if $c \ne \zeta$ and
the moment sequence is defined. These polynomials are also generalizations of
the Bessel polynomials.


VI. If to the Padé-theorem obtained in III we apply the substitutions 
described in IV, or if to the Padé-theorem obtained in IV we apply the sub- 
stitutions described in III we obtain in either case 

** - 
Vin (x) = (—1)"q"""*” Fo ([o, —m]; xq 9 

These polynomials are in fact the Stieltjes-Wigert polynomials, and are
orthogonal with respect to $q^{\frac12 k(k+1)}$, $q^{\frac12 (k+1)(k+2)}$, $q^{\frac12 (k+2)(k+3)}$, ..., if $q \ne 1$.


VII. If to the Padé-theorem obtained in II we apply the substitutions
described in V or if to the Padé-theorem obtained in V we apply the substitutions
described in II, we obtain in each case


Vin (x) _ (—1)"g *"™ Fe (Ie, —m); xq"***"). 










These are also Stieltjes-Wigert polynomials, and in fact are the same as
those in VI but with $q^{-1}$ replacing q. Hence they are orthogonal with respect to
$q^{-\frac12 k(k+1)}$, $q^{-\frac12 (k+1)(k+2)}$, $q^{-\frac12 (k+2)(k+3)}$, ....

The expressions for the stepline polynomials in cases II to VII can be found
from those for the diagonal polynomials. The recurrence relations and
norms in cases II to VII and for the corresponding stepline polynomials can
likewise easily be deduced as in I.


PART III 


14. Remarks concerning further generalization. The result of Part II 
being a generalization of the classical orthogonal polynomials, we note that 
the generalization of the hypergeometric series and consequently that of the 
classical orthogonal polynomials is strongly restricted by the rather un- 
natural-looking condition (8.8). This condition, it appears, has been intro- 
duced mainly to establish the recurrence relations (8.16) and (8.17), which 
are consequences of one another, and play an essential role in the proof of 
the key theorem 11.1 as they render the induction possible. It would therefore 
be much more natural to require, instead of (8.8), the existence of a relation 
of type (8.16). In view of (6.3), (11.6), and (11.8) the condition 


(14.1) $F([a, k], [b, l]; [c, m]; x) - F([a, k+1], [b, l]; [c, m+1]; x)$
$\qquad + \psi(a, b, c; k, l, m)\, x\, F([a, k+1], [b, l+1]; [c, m+2]; x) = 0$,


where $\psi$ is a suitably chosen function, looks very natural and general. In the
following sections, however, it will be shown that the class of series satisfying
(8.1), (8.2), and (14.1) is not essentially more general than that considered
in Part II, in the sense that it does not give rise to a more general class of
orthogonal polynomials.

From (14.1) it follows (cf. the proof of (8.16)) that


(14.2) $[a, k][c, m+\mu] - [a, k+\mu][c, m] + \psi(a, b, c; k, l, m)\, [c, m][c, m+1][b, l]^{-1}[\sigma, \mu] = 0$

for $\mu = 1, 2, 3, \ldots$. This can be written as

(14.3) $[a, k+\mu][c, m] - [a, k][c, m+\mu] = \chi(a, c; k, m)\, [\sigma, \mu]$, $\mu = 1, 2, 3, \ldots$


and hence the first problem is to find the functions $[s, k]$ and $\chi$ satisfying (14.3).
We shall solve this problem in the next section. To avoid unessential
difficulties we shall assume that R is algebraically closed. Since in the following a
and c will be considered constant, we put $[a, k] = f(k)$, $[c, k] = g(k)$, $[\sigma, k]
= h(k)$, $\chi(a, c; k, m) = \chi(k, m)$. Then the difference equation (14.3) becomes


(14.4) $f(k+\mu)\, g(m) - f(k)\, g(m+\mu) = \chi(k, m)\, h(\mu)$, $\mu = 1, 2, 3, \ldots$.


15. Solution of the difference equation. We shall first prove the
following:


THEOREM 15.1. The functions f(k) and g(k) satisfying (14.4), none of them
being identically zero, satisfy the same trinomial linear recurrence relation with
constant coefficients not all zero, that is,


(15.1) $f(k+2) + q f(k+1) + r f(k) = 0$, $\quad g(k+2) + q g(k+1) + r g(k) = 0$ for all k.


Proof. First consider the case when f and g satisfy the relation
(15.2) $f(k+1)\, g(m) - f(k)\, g(m+1) = 0$ for all k and m


(this occurs, for example, if $\chi(k, m) \equiv 0$). There exist $k_0$ and $m_0$ such that
$f(k_0) \ne 0$, $g(m_0) \ne 0$; hence, putting


$f(k_0+1)\, f(k_0)^{-1} = g(m_0+1)\, g(m_0)^{-1} = \rho$,


we have
(15.3) $f(k+1) - \rho f(k) = 0$ for all k
(15.4) $g(m+1) - \rho g(m) = 0$ for all m,


which are identical binomial linear recurrence relations.
Hence we may confine ourselves to the case when $\chi(k, m) \not\equiv 0$, and without
loss of generality we may assume $\chi(0, 0) \ne 0$.
From (14.4) with k = 0, m = 0, we have
(15.5) $h(\mu) = \chi(0, 0)^{-1} \{f(\mu)g(0) - f(0)g(\mu)\}$,
hence
(15.6) $f(k+\mu)g(m) - f(k)g(m+\mu) = \chi(k, m)\, \chi(0, 0)^{-1} \{f(\mu)g(0) - f(0)g(\mu)\}$.
Substituting $\mu = 1$ resp. $\mu = 2$ we obtain
(15.7) $f(k+1)g(m) - f(k)g(m+1) = \chi(k, m)\, \chi(0, 0)^{-1} \{f(1)g(0) - f(0)g(1)\}$,
(15.8) $f(k+2)g(m) - f(k)g(m+2) = \chi(k, m)\, \chi(0, 0)^{-1} \{f(2)g(0) - f(0)g(2)\}$.


Elimination of $\chi(k, m)\, \chi(0, 0)^{-1}$ from (15.7) and (15.8) gives
(15.9) $\{f(k+2)g(m) - f(k)g(m+2)\}\{f(1)g(0) - f(0)g(1)\}$
$\qquad = \{f(k+1)g(m) - f(k)g(m+1)\}\{f(2)g(0) - f(0)g(2)\}$.
Substituting m = 0 in (15.9) and rearranging:
(15.10) $f(k+2)g(0)\{f(1)g(0) - f(0)g(1)\}$
$\qquad - f(k+1)g(0)\{f(2)g(0) - f(0)g(2)\} + f(k)g(0)\{g(1)f(2) - g(2)f(1)\} = 0$.
Substituting k = 0 in (15.9) and rearranging:


(15.11) $g(m+2)f(0)\{f(1)g(0) - f(0)g(1)\}$
$\qquad - g(m+1)f(0)\{f(2)g(0) - f(0)g(2)\} + g(m)f(0)\{g(1)f(2) - g(2)f(1)\} = 0$.
















Without loss of generality we may assume that $f(1)g(0) - f(0)g(1) \ne 0$,
since if $f(1)g(0) - f(0)g(1) = 0$, the formula (15.7) coincides with (15.2), a
case which has already been considered.

We distinguish two possibilities. First suppose $f(k)g(m)\chi(k, m) \not\equiv 0$. Then
we may assume without loss of generality that $f(0)g(0)\chi(0, 0) \ne 0$. Dividing
(15.10) and (15.11) by g(0) and f(0) respectively, it is clear that
f(k) and g(k) satisfy the same trinomial linear recurrence relation with
constant coefficients which are not all zero.

Now suppose $f(k)g(m)\chi(k, m) \equiv 0$. We still suppose $f(1)g(0) - f(0)g(1)
\ne 0$. Now, if $k_1$ and $m_1$ are such that $f(k_1)$ and $g(m_1)$ are not zero, we have
$\chi(k_1, m_1) = 0$ and hence


$f(k_1+\mu)\, g(m_1) - f(k_1)\, g(m_1+\mu) = 0$ for $\mu = 1, 2, 3, \ldots$.
Hence
$g(m_1+\mu) = g(m_1)\, f(k_1)^{-1}\, f(k_1+\mu)$ for $\mu = 1, 2, 3, \ldots$.


From this it follows that g(m) satisfies any linear recurrence relation with
constant coefficients that is satisfied by f(k). However, for $m = m_1$ it follows
from (15.9) that f(k) satisfies a trinomial linear recurrence relation with
constant coefficients. This completes the proof of the theorem.


A check of this proof shows that (14.4) has only been used for $\mu = 1$ and 2.

With the assumption that R is algebraically closed, it is now easy to derive
from Theorem 15.1 the functions f, g, h and $\chi$ satisfying (14.4). In fact, from
the theory of linear difference equations, it follows (and it is easy to verify
this directly), that any functions f and g satisfying (15.1) are given by


(15.12) $f(k) = w_1 \rho_1^k + w_2 \rho_2^k$, $\quad g(m) = w_3 \rho_1^m + w_4 \rho_2^m$,

whenever the equation $x^2 + qx + r = 0$ has distinct roots $\rho_1$ and $\rho_2$. If this
equation has two coincident roots $\rho$ we have

(15.13) $f(k) = (w_1 k + w_2)\rho^k$, $\quad g(m) = (w_3 m + w_4)\rho^m$.


In both cases $w_1, \ldots, w_4$ are arbitrary constants.

Corresponding to (15.12) we find $\chi(k, m) = w_1 w_4 \rho_1^k \rho_2^m - w_2 w_3 \rho_2^k \rho_1^m$ and
$h(\mu) = \rho_1^\mu - \rho_2^\mu$.

Corresponding to (15.13) we find


$\chi(k, m) = \{w_1(w_3 m + w_4) - w_3(w_1 k + w_2)\}\rho^{k+m}$ and $h(\mu) = \mu \rho^\mu$.
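
Both families of solutions can be substituted back into (14.4) mechanically. The following Python sketch does this with exact fractions; the function names and the sample constants are assumptions chosen only for the illustration, and the two checks correspond to (15.12) and (15.13) with the values of $\chi$ and h stated above.

```python
from fractions import Fraction

def check_distinct(w, r1, r2, k, m, mu):
    """Check (14.4) for the solution (15.12) with distinct roots r1, r2."""
    w1, w2, w3, w4 = w
    f = lambda t: w1 * r1**t + w2 * r2**t
    g = lambda t: w3 * r1**t + w4 * r2**t
    chi = w1 * w4 * r1**k * r2**m - w2 * w3 * r2**k * r1**m
    h = r1**mu - r2**mu
    return f(k + mu) * g(m) - f(k) * g(m + mu) == chi * h

def check_coincident(w, r, k, m, mu):
    """Check (14.4) for the solution (15.13) with a double root r."""
    w1, w2, w3, w4 = w
    f = lambda t: (w1 * t + w2) * r**t
    g = lambda t: (w3 * t + w4) * r**t
    chi = (w1 * (w3 * m + w4) - w3 * (w1 * k + w2)) * r**(k + m)
    h = mu * r**mu
    return f(k + mu) * g(m) - f(k) * g(m + mu) == chi * h

w = (Fraction(2), Fraction(-3), Fraction(5), Fraction(7))
print(all(check_distinct(w, Fraction(2), Fraction(1, 3), k, m, mu)
          for k in range(-2, 3) for m in range(-2, 3) for mu in range(1, 4)))
print(all(check_coincident(w, Fraction(3, 2), k, m, mu)
          for k in range(-2, 3) for m in range(-2, 3) for mu in range(1, 4)))
```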
Recalling the role of the functions f, g, and 4, we obtain the following: 


THEOREM 15.2. Every function $[s, k]$ such that the general hypergeometric
series satisfies the functional equation (14.1) is given either by


(15.14) $[s, k] = w_1(s)\rho_1^k + w_2(s)\rho_2^k$, $\quad w_1(\sigma) = -w_2(\sigma) = 1$
or by
(15.15) $[s, k] = \{k\, w_1(s) + w_2(s)\}\rho^k$, $\quad w_1(\sigma) = 1$, $w_2(\sigma) = 0$,


where $\rho$, $\rho_1$ and $\rho_2$ are not zero, $\rho_1 \ne \rho_2$.










16. The nature of the more general series. Let us first consider the
case in which $[s, k]$ is given by (15.14). Put $s' = -w_2(s)\, w_1(s)^{-1}$ if $w_1(s) \ne 0$,
put $q = \rho_2 \rho_1^{-1}$, and put $[s', k]' = 1 - s' q^k$. Then, if $w_1(a)$, $w_1(b)$, $w_1(c)$ are all
$\ne 0$, we find after some calculations that the series in (8.15) is equal to
$F([a', k]', [b', l]'; [c', m]'; x\, w_1(a) w_1(b) w_1(c)^{-1} \rho_1^{k+l-m-1})$. Now consider the
case that not all of $w_1(a)$, $w_1(b)$, $w_1(c)$ are different from zero, for example,
$w_1(a) = 0$. Then the series in (8.15) is equal to ${}_1F_1([b', l]'; [c', m]'; x\, w_2(a)
w_1(b) w_1(c)^{-1} \rho_2^{k} \rho_1^{l-m-1})$. Similar results are obtained in the other cases
when one or more of $w_1(a)$, $w_1(b)$, $w_1(c)$ are zero. It follows that if $[s, k]$ is given
by (15.14) the general hypergeometric series always coincides with a (possibly
confluent) Heine series in which the variable x may be multiplied by a constant
factor.
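
The reduction rests on the factorization $[s, k] = w_1(s)\, \rho_1^k\, (1 - s' q^k)$. A short Python check of this identity with exact fractions is sketched below; the function names and the sample values are assumptions made for the illustration.

```python
from fractions import Fraction

def general_bracket(w1, w2, r1, r2, k):
    """[s, k] in the form (15.14): w1 r1^k + w2 r2^k."""
    return w1 * r1**k + w2 * r2**k

def heine_form(w1, w2, r1, r2, k):
    """w1 r1^k (1 - s' q^k) with s' = -w2 / w1 and q = r2 / r1 (needs w1 != 0)."""
    s_prime = -w2 / w1
    q = r2 / r1
    return w1 * r1**k * (1 - s_prime * q**k)

args = (Fraction(2), Fraction(-5, 3), Fraction(3), Fraction(1, 2))
print(all(general_bracket(*args, k) == heine_form(*args, k) for k in range(-3, 4)))
```

The factor $w_1(s)\rho_1^k$ is what ends up in the constant multiplying x, while the remaining factor is the Heine bracket.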

Let us now consider the case in which $[s, k]$ is given by (15.15). Put
$s' = w_2(s)\, w_1(s)^{-1}$ if $w_1(s) \ne 0$ and put $[s', k]'' = s' + k$. If $w_1(a)$, $w_1(b)$ and
$w_1(c)$ are different from zero, we find that the series in (8.15) is equal to

$F([a', k]'', [b', l]''; [c', m]''; x\, w_1(a) w_1(b) w_1(c)^{-1} \rho^{k+l-m-1})$.

Similarly, if one or more of the $w_1(a)$, $w_1(b)$, $w_1(c)$ are zero, we get confluent
series. Investigation of all possible cases finally gives

THEOREM 16.1. If the function $[s, k]$ is such that the series $F([a, k], [b, l];
[c, m]; x)$ satisfies the functional equation (14.1), then this series is always a
(possibly confluent) ordinary hypergeometric or Heine series in which the variable
x may be multiplied by a constant factor.

This theorem implies the justification of our assertion in §14, that the
series which satisfy (14.1) instead of (8.8) do not give a more general class of
orthogonal polynomials than those obtained in Part II.


REFERENCES 


1. E. Frank, Orthogonality properties of C-fractions, Bull. Amer. Math. Soc., 55 (1949), 384-390.
2. W. Hahn, Ueber Orthogonalpolynome die q-Differenzen-gleichungen genügen, Math. Nachr., 2 (1949), 4-34.
3. H. L. Krall, Certain differential equations for Tchebycheff polynomials, Duke Math. J., 4 (1938), 705-718.
4. H. L. Krall and O. Frink, A new class of orthogonal polynomials: the Bessel polynomials, Trans. Amer. Math. Soc., 65 (1949), 100-115.
5. H. Padé, Récherches sur la convergence des développements en fractions continues d'une certaine catégorie des fonctions, Ann. Sci. Ecole Norm. Sup., 3, 24 (1907), 341-400.
6. O. Perron, Die Lehre von den Kettenbrüchen, 2. Aufl. (Leipzig, 1929).
7. H. van Rossum, A theory of orthogonal polynomials based on the Padé-table (Diss. Utrecht, Assen 1953).
8. H. van Rossum, Systems of orthogonal and quasi-orthogonal polynomials connected with the Padé-table. I, Kon. Ned. Akad. Wet., Proc. Section of Sciences, 58 Ser. A (1955), 517-525; II, idem, 526-534; III, idem, 675-682.
9. G. Szegö, Orthogonal polynomials, A.M.S. Coll. Publ. XXIII (New York, 1939).





University of New Brunswick 














ON THE INVERSION OF THE GAUSS 
TRANSFORMATION, II 


P. G. ROONEY 


1. Introduction. In an earlier paper (5) we studied the inversion theory
of the Gauss transformation defined by


(1.1) $f(x) = G(\phi(x)) = (4\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-\frac14 (x-t)^2}\, \phi(t)\, dt$.

Operational methods indicated that

$\exp(-D^2) f(x) = \phi(x)$


and we showed that in certain circumstances this equation was true if
$\exp(-D^2) f(x)$ was interpreted as the sum of the series


(1.2) $\sum_{n=0}^{\infty} (-1)^n f^{(2n)}(x) / n!$


However, another possible interpretation of $\exp(-D^2) f(x)$ arises from
the well known formula

$e^{-a} = \lim_{n \to \infty} \left(1 - \dfrac{a}{n}\right)^n$,

and we shall show here that such an interpretation also leads to an inversion
formula for the transformation. This is done in § 2.

Pollard (4) has developed an $L_2$ theory for inversion by the series (1.2),
and in § 3 we shall develop a similar theory for our inversion.


2. Convergence theory. The two theorems below give sets of conditions 
for inversion. We first prove a preliminary lemma. 


LEMMA. If $\sum_{n=0}^{\infty} a_n$ converges to the sum a, then

$\lim_{n \to \infty} \sum_{r=0}^{n} \dfrac{n!}{(n-r)!\, n^{r}}\, a_r = a$.


Proof. Let $S_n = a_0 + a_1 + \ldots + a_n$. Then


Received September 17, 1957. This work was done while the author held a summer research 
associateship of the National Research Council of Canada. 


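$\sum_{r=0}^{n} \dfrac{n!}{(n-r)!\, n^{r}}\, a_r = \sum_{r=0}^{n} \dfrac{n!\, r}{(n-r)!\, n^{r+1}}\, S_r$.

Hence, if $a_0 = 1$, $a_n = 0$, $n > 0$, we have $S_n = 1$, $n = 0, 1, 2, \ldots$, and

$\sum_{r=0}^{n} \dfrac{n!\, r}{(n-r)!\, n^{r+1}} = 1$. Also $\lim_{n \to \infty} \dfrac{n!\, r}{(n-r)!\, n^{r+1}} = 0$,

so that the result follows from (2, § 3.1, Theorem 2).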
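
The summation by parts used in this proof, and the fact that the weights on the partial sums add up to 1, can be confirmed exactly for small n. The Python sketch below is only an illustration; the function names and the sample sequence are assumptions.

```python
from fractions import Fraction
from math import factorial

def lemma_weight(n, r):
    """Weight n! r / ((n - r)! n^(r+1)) multiplying the partial sum S_r."""
    return Fraction(factorial(n) * r, factorial(n - r) * n**(r + 1))

def transformed_sum(a, n):
    """sum_{r=0}^{n} n! / ((n - r)! n^r) a_r for a list a with len(a) > n."""
    return sum(Fraction(factorial(n), factorial(n - r) * n**r) * a[r]
               for r in range(n + 1))

def by_parts_sum(a, n):
    """The same quantity written in terms of the partial sums S_r."""
    total, partial = Fraction(0), Fraction(0)
    for r in range(n + 1):
        partial += a[r]
        total += lemma_weight(n, r) * partial
    return total

a = [Fraction((-1)**r, r + 1) for r in range(12)]
print(sum(lemma_weight(8, r) for r in range(9)))       # weights add up to 1
print(transformed_sum(a, 8) == by_parts_sum(a, 8))     # summation by parts
```

Since each weight is non-negative, tends to zero for fixed r, and the row sums equal 1, the transformation is of the regular type covered by the cited theorem on linear transformations of sequences.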


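THEOREM 1. If $\phi(t) \in L(-\delta, \delta)$, $\delta > 0$, $|t|^{\lambda} \exp[-\frac14 (x_0 - t)^2]\, \phi(t) \in
L(-\infty, \infty)$ for some $\lambda > \frac12$, and $\phi(t)$ is of bounded variation in a neighbourhood
of $t = x_0$, then f(x), as defined by (1.1), exists for all x, and


$\lim_{n \to \infty} \left(1 - \dfrac{D^2}{n}\right)^n f(x) \Big|_{x = x_0} = \frac12 \{\phi(x_0+) + \phi(x_0-)\}$.


Proof. By (5, Theorem 1) f(x) exists for all x and (1.2) converges for $x = x_0$
to $\frac12 \{\phi(x_0+) + \phi(x_0-)\}$. But then by the lemma, with $a_r = (-1)^r f^{(2r)}(x_0)/r!$,


$\lim_{n \to \infty} \sum_{r=0}^{n} \dfrac{n!}{(n-r)!\, n^{r}}\, (-1)^r \dfrac{f^{(2r)}(x_0)}{r!} = \frac12 \{\phi(x_0+) + \phi(x_0-)\}$.


Hence,


$\lim_{n \to \infty} \left(1 - \dfrac{D^2}{n}\right)^n f(x) \Big|_{x = x_0} = \frac12 \{\phi(x_0+) + \phi(x_0-)\}$.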



THEOREM 2. If $\exp[-\frac14 (x_0 - t)^2]\, \phi(t) \in L(-\infty, \infty)$, $\phi(t)$ is of bounded
variation in a neighbourhood of $t = x_0$, and the series (1.2) converges for $x = x_0$,
then f(x), as defined by (1.1), exists for all x, and


$\lim_{n \to \infty} \left(1 - \dfrac{D^2}{n}\right)^n f(x) \Big|_{x = x_0} = \frac12 \{\phi(x_0+) + \phi(x_0-)\}$.


Proof. By (5, Theorem 2), the series (1.2) is summable for $x = x_0$ in the
Abel sense to $\frac12 \{\phi(x_0+) + \phi(x_0-)\}$. But, since (1.2) converges for $x = x_0$,
and the Abel method is a regular method of summation, (1.2) converges for
$x = x_0$ to $\frac12 \{\phi(x_0+) + \phi(x_0-)\}$. Hence, by the lemma

$\lim_{n \to \infty} \left(1 - \dfrac{D^2}{n}\right)^n f(x) \Big|_{x = x_0} = \lim_{n \to \infty} \sum_{r=0}^{n} \dfrac{n!}{(n-r)!\, n^{r}}\, (-1)^r \dfrac{f^{(2r)}(x_0)}{r!} = \frac12 \{\phi(x_0+) + \phi(x_0-)\}$.















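3. $L_2$ theory.


THEOREM 3. If $\phi \in L_2(-\infty, \infty)$, then f(x) as defined by (1.1) exists for all
x, and


$\operatorname*{l.i.m.}_{n \to \infty} \left(1 - \dfrac{D^2}{n}\right)^n f(x) = \phi(x)$.


Proof. The existence of f(x) is clear. Let $\Phi$ be the Fourier transform of $\phi$.
Then since the Fourier transform of $(4\pi)^{-1/2} \exp[-\frac14 (x - t)^2]$ is


$(2\pi)^{-1/2} \exp(iyx - y^2)$


we have, on applying the Parseval relation (6, Theorem 49 and 2.1.2), that


$f(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-ixy - y^2}\, \Phi(y)\, dy$.

Since for $r = 0, 1, 2, \ldots$,

$|(-iy)^r e^{-ixy - y^2}\, \Phi(y)| = |y|^r e^{-y^2} |\Phi(y)| \in L(-\infty, \infty)$,


it follows from (3, Corollary 39.2), that we may differentiate this integral as
often as we like under the integral sign and obtain


$f^{(r)}(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} (-iy)^r e^{-ixy - y^2}\, \Phi(y)\, dy$

and hence


$S_n(x) = \left(1 - \dfrac{D^2}{n}\right)^n f(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} \left(\sum_{r=0}^{n} \binom{n}{r} \dfrac{y^{2r}}{n^{r}}\right) e^{-ixy - y^2}\, \Phi(y)\, dy$

$\qquad = (2\pi)^{-1/2} \int_{-\infty}^{\infty} \left(1 + \dfrac{y^2}{n}\right)^n e^{-ixy - y^2}\, \Phi(y)\, dy$.


Hence, if $\phi_n$ is the Fourier transform of $S_n$,

$\phi_n(y) = \left(1 + \dfrac{y^2}{n}\right)^n e^{-y^2}\, \Phi(y)$.

Hence, from (6, Theorem 50 and 2.1.3),


$\int_{-\infty}^{\infty} |S_n(x) - \phi(x)|^2\, dx = \int_{-\infty}^{\infty} |\phi_n(y) - \Phi(y)|^2\, dy = \int_{-\infty}^{\infty} \left| \left( \left(1 + \dfrac{y^2}{n}\right)^n e^{-y^2} - 1 \right) \Phi(y) \right|^2 dy$.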


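Now the integrand in this last integral tends to zero a.e. as $n \to \infty$. Also
a short calculation shows that

$\left| \left(1 + \dfrac{y^2}{n}\right)^n e^{-y^2} - 1 \right| \le 1$,

and thus

$\left| \left(1 + \dfrac{y^2}{n}\right)^n e^{-y^2} - 1 \right|^2 |\Phi(y)|^2 \le |\Phi(y)|^2 \in L_1(-\infty, \infty)$.


Hence, by the theorem of dominated convergence,


$\lim_{n \to \infty} \int_{-\infty}^{\infty} |S_n(x) - \phi(x)|^2\, dx = 0$,


that is,

$\operatorname*{l.i.m.}_{n \to \infty} \left(1 - \dfrac{D^2}{n}\right)^n f(x) = \phi(x)$.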
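
The two facts about the multiplier $(1 + y^2/n)^n e^{-y^2}$ used in the proof, namely that it lies between 0 and 1 and tends to 1 as n grows, are easy to observe numerically. The Python sketch below is only a floating-point illustration; the function name and the sample points are assumptions.

```python
import math

def multiplier(y, n):
    """(1 + y^2/n)^n * exp(-y^2), the factor relating the transform of S_n to that of phi."""
    return (1.0 + y * y / n) ** n * math.exp(-y * y)

# The factor stays in (0, 1], so |multiplier - 1| <= 1.
for y in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(y, [round(multiplier(y, n), 6) for n in (1, 5, 50)])

# For fixed y it approaches 1 as n grows.
print(abs(multiplier(1.5, 10**6) - 1.0))
```

The bound follows from $1 + u \le e^{u}$ with $u = y^2/n$, which is the short calculation referred to in the text.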
REFERENCES 

1. A. Erdélyi et al., Tables of Integral Transforms, I (New York, 1954).
2. G. H. Hardy, Divergent Series (Oxford, 1949). 
3. E. J. McShane, Integration (Princeton, 1944). 
4. Harry Pollard, Integral transforms, Duke Math. J., 13 (1946), 307-330. 
5. P. G. Rooney, On the inversion of the Gauss transformation, Can. J. Math., 9 (1957), 459-465. 
6. E. C. Titchmarsh, An Introduction to the Theory of Fourier Integrals (2nd ed.; Oxford, 1948). 


University of Toronto














GENERALIZED INTEGRALS WITH RESPECT TO 
FUNCTIONS OF BOUNDED VARIATION 


R. L. JEFFERY 


Introduction. A considerable literature has grown up around the analysis 
of the structure of a function in terms of its derivative, and the structure of 
functions F(x) which are integrals of various kinds. Some of this relates to 
derivatives and integrals of F(x) with respect to functions of bounded variation 
w(x) (1-6) or, in the case of a paper by Ward (4), with respect to a function 
of generalized bounded variation in the restricted sense. While functions of 
bounded variation have at most a denumerable set of discontinuities, yet this 
set can be everywhere dense, and in the studies to which we refer a considera- 
tion of these discontinuities enters, sometimes in a complicated way. In the 
present paper results are obtained without reference to the values of F(x) or 
w(x) at the points of discontinuity of w. The results lead to a descriptive 
definition of a Lebesgue-Stieltjes integral of a function with respect to w and 
a descriptive definition of a generalized integral with respect to w. The latter 
involves functions F(x) which are generalized absolutely continuous relatively 
to w. 

Because w(x) can be written as the difference of two non-decreasing functions
there is no loss of generality in taking w(x) to be non-decreasing. We shall
consider a function w(x) on a closed interval [a, b] with the understanding
that w(x) = w(a), x < a, w(x) = w(b), x > b. With w(x) given we shall
denote by $\mathfrak{U}$ a class of functions F(x) defined at the points of continuity of
w(x) on [a, b]. Furthermore, if $\mathfrak{C}$ is the set over which w is continuous, then
F(x) is continuous over $\mathfrak{C}$ at points of $\mathfrak{C}$, and if $x_0$ is a point of discontinuity
of w then F(x) tends to a limit as x tends to $x_0+$ and to $x_0-$, $x \in \mathfrak{C}$. These
limits will be denoted by $F(x_0+)$, $F(x_0-)$. Also F(x) = F(a+) for x < a
and F(x) = F(b-) for x > b. F(x) may, or may not, be defined at points of
discontinuity of w.


1. The w-measure of a set E on [a, b]. Let (a', b') be an open interval on
[a, b]. The w-measure of (a', b') is w(b'-) - w(a'+), and is denoted by
$|(a', b')|_w$. Let E be any set on [a, b]. Let $\alpha_1, \alpha_2, \ldots$ be a set of non-overlapping
open intervals containing E. The outer w-measure of the set E is the infimum
of $\sum |\alpha_i|_w$ for all such sets of open intervals. This outer measure is denoted by
$|E|_w^{\circ}$. Let $\bar{E}$ be the complement of the set E. If for $\epsilon > 0$ there exists a set
of non-overlapping open intervals $\alpha = \alpha_1 + \alpha_2 + \ldots$, $\alpha \supset E$ and a similar
set $\beta \supset \bar{E}$ for which $|\alpha\beta|_w^{\circ} < \epsilon$ then the set E is said to be w-measurable.
The w-measure of E, denoted by $|E|_w$, is equal to $|E|_w^{\circ}$.
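
To see how jumps of w enter the measure of an open interval, the following Python sketch evaluates w(b'-) - w(a'+) for a hypothetical w consisting of the identity function plus finitely many jumps. The form of w, the function name, and the sample jump are all assumptions made for the illustration.

```python
def w_measure_open_interval(a1, b1, jumps):
    """w(b1-) - w(a1+) for the assumed w(x) = x + (sum of jumps at points <= x).

    jumps maps a point to the size of the jump of w at that point.
    """
    # Left-hand limit at b1: only jumps strictly to the left of b1 count.
    left_limit_at_b = b1 + sum(j for p, j in jumps.items() if p < b1)
    # Right-hand limit at a1: jumps at points up to and including a1 count.
    right_limit_at_a = a1 + sum(j for p, j in jumps.items() if p <= a1)
    return left_limit_at_b - right_limit_at_a

jumps = {0.5: 2.0}
print(w_measure_open_interval(0.0, 1.0, jumps))  # the jump lies inside the interval
print(w_measure_open_interval(0.5, 1.0, jumps))  # the jump sits at the left end point
```

A jump at an end point of the open interval contributes nothing, in line with the use of the one-sided limits w(b'-) and w(a'+) in the definition.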


Received October 21, 1957. 




LEMMA 1. If $\alpha = \alpha_1 + \alpha_2 + \ldots$ is a set of non-overlapping open intervals
$\alpha_1, \alpha_2, \ldots$ then $\alpha$ is w-measurable.


Because w(x) is BV it follows that $\sum |\alpha_i|_w$ converges. Hence for $\epsilon > 0$ there
exists an integer $n_0$ such that for $n > n_0$,

$\sum_{i=n+1}^{\infty} |\alpha_i|_w < \epsilon$.

Fix $n > n_0$. Let $\alpha_i = (a_i, b_i)$, $i \le n$. Take $a_i'$, $b_i'$ with $a_i < a_i' < b_i' < b_i$
and such that

$\sum_{i=1}^{n} \{|(a_i, a_i')|_w + |(b_i', b_i)|_w\} < \epsilon$.

Let $\beta$ be the open set complementary to the finite set of closed intervals
$[a_1', b_1'], \ldots, [a_n', b_n']$. Then $\beta \supset \bar{\alpha}$ and

$|\alpha\beta|_w = \sum_{i=n+1}^{\infty} |\alpha_i|_w + \sum_{i=1}^{n} \{|(a_i, a_i')|_w + |(b_i', b_i)|_w\} < 2\epsilon$.

Because $\epsilon$ is arbitrary it follows that $\alpha$ satisfies the definition of measurability.

LEMMA 2. Let E be any set on [a, b]. Let each point $x \in E$ be the left hand
end point of a set of intervals $(x, x + h_i)$ for which $h_i \to 0$ as $i \to \infty$. Let $\mathfrak{F}$
denote the set of intervals thus associated with the set E. Let $\epsilon > 0$ be given.
There exists a finite non-overlapping set $\Delta_i$ of the intervals of $\mathfrak{F}$ for which

$\sum |\Delta_i|_w < |E|_w^{\circ} + \epsilon$, $\quad \sum |\Delta_i E|_w^{\circ} > |E|_w^{\circ} - \epsilon$.
Let $\eta > 0$ be given. Put E in a set of open intervals $\alpha_1, \alpha_2, \ldots$ in such a
way that $\sum |\alpha_i|_w < |E|_w^{\circ} + \eta$. Now let $\alpha$ be a finite set of the intervals
$\alpha_1, \alpha_2, \ldots$, $\alpha = \alpha_1 + \alpha_2 + \ldots + \alpha_n$ where n is sufficiently great to insure
that $|\alpha E|_w^{\circ} > |E|_w^{\circ} - \eta$. Let $\alpha_i = (a_i, b_i)$. Let $\eta'$ be so fixed that if $b_i'$ is
on $(a_i, b_i)$, $a_i < b_i' < b_i$ and $b_i - b_i' < \eta'$ then

(1) $\sum |E(b_i', b_i)|_w^{\circ} < \eta$.

Let $E_\delta$ be the points of $E\alpha$ which are such that if x is in $E_\delta$ then there is an
interval (x, x + h) of $\mathfrak{F}$ with $h > \delta$ and x, x + h on the same interval of the
set $\alpha$. If $\delta_2 < \delta_1$ then

$E_{\delta_2} \supset E_{\delta_1}$.

Hence $|E_\delta|_w^{\circ} \to |E\alpha|_w^{\circ}$ as $\delta \to 0$. Fix $\delta < \eta'$ and sufficiently near zero to insure
that

(2) $|E_\delta|_w^{\circ} > |E\alpha|_w^{\circ} - \eta > |E|_w^{\circ} - 2\eta$.

Now consider the interval $\alpha_1 = (a_1, b_1)$. There is a point $x_1' \ge a_1$ which is
(a) the first point of $E_\delta$ to the right of $a_1$ or (b) the infimum of points of $E_\delta$ on
$(a_1, b_1)$. For case (a) let $x_1' = x_1$ and let $(x_1, x_1 + h_1)$ be an interval of $\mathfrak{F}$
on $(a_1, b_1)$ with $h_1 > \delta$. For case (b) take $x_1$ a point of $E_\delta$ and such that

(3) $|E(x_1', x_1)|_w^{\circ} < \epsilon_1$

where $\epsilon_1$ is the first member of a sequence of positive numbers $\epsilon_1, \epsilon_2, \ldots$ for
which $\sum \epsilon_i < \eta/n$, and take $(x_1, x_1 + h_1)$ an interval of $\mathfrak{F}$ with $h_1 > \delta$.

In the foregoing replace $a_1$ by $x_1 + h_1$ and arrive at $x_2' \ge x_1 + h_1$ satisfying
(a) or (b) and an interval $(x_2, x_2 + h_2)$ of $\mathfrak{F}$ on $(a_1, b_1)$ corresponding to
$(x_1, x_1 + h_1)$ with $h_2 > \delta$. If case (b) holds choose $x_2$ so that

(4) $|E(x_2', x_2)|_w^{\circ} < \epsilon_2$.

Since each $h_i > \delta$, and $\delta < \eta'$ this procedure can be continued to get a finite
non-overlapping set of intervals $(x_1, x_1 + h_1)$, $(x_2, x_2 + h_2)$, ..., $(x_m, x_m + h_m)$
of the set $\mathfrak{F}$ for which no points of $\alpha E_\delta$ are to the right of $x_m + h_m$ or

(5) $b_1 - (x_m + h_m) < \eta'$.

Also the intervals $(x_i, x_i + h_i)$, $i = 1, 2, \ldots, m$, have been so chosen that

(6) $\sum |E(x_i', x_i)|_w^{\circ} < \sum \epsilon_i < \eta/n$.

This procedure can be repeated for each of the remaining intervals $\alpha_2, \alpha_3, \ldots,
\alpha_n$ of the set $\alpha$.

From (1), (5), (6) and the relations similar to (5) and (6) for the intervals
$\alpha_2, \ldots, \alpha_n$ it follows that the total set $\Delta_i = (x_i, x_i + h_i)$ obtained by this
process are on $\alpha$, are non-overlapping and

$\sum |E_\delta \Delta_i|_w^{\circ} > |\alpha E_\delta|_w^{\circ} - 2\eta$.

This, with (2) and the fact that $E_\delta$ is on $\alpha$ and $\Delta_i E_\delta \subset \Delta_i E$ gives
$\sum |\Delta_i E|_w^{\circ} > |E|_w^{\circ} - 4\eta$. Because $\Delta_i$ is on $\alpha$ and because $\eta$ is arbitrary the lemma follows.


Definition 1. A function $F(x)$ defined on $[a, b]$ and in class U is absolutely continuous relative to $\omega$, AC$-\omega$, if for $\epsilon > 0$ there exists $\delta > 0$ such that for any set of non-overlapping intervals $(x_i, x_i')$ on $[a, b]$ with $\sum\{\omega(x_i'+) - \omega(x_i-)\} < \delta$ the relation $\sum|F(x_i'+) - F(x_i-)| < \epsilon$ is satisfied.


2. The derivatives of F(x) with respect to $\omega$. Let $F(x)$ be a function in class U. Define the function $\psi(x, h)$ by the relation

$$\psi(x, h) = \begin{cases} \dfrac{F(x + h) - F(x-)}{\omega(x + h) - \omega(x-)}, & h > 0,\ \omega(x + h) - \omega(x-) \neq 0, \\[2ex] \dfrac{F(x + h) - F(x+)}{\omega(x + h) - \omega(x+)}, & h < 0,\ \omega(x + h) - \omega(x+) \neq 0, \\[2ex] 0, & \omega(x + h) - \omega(x+) = 0, \end{cases}$$

for points of continuity $x + h$ of $\omega$. If $\psi(x, h)$ tends to a limit as $h \to 0$ this limit is the derivative of $F(x)$ with respect to $\omega(x)$, $D_\omega F$. The upper and lower limits of this ratio are the upper and lower derived numbers of $F(x)$ with respect to $\omega$. It is to be noted that $D_\omega F$ exists at points of discontinuity of $\omega$ even when $F(x)$ is not defined at such a point.
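
As an illustration of this definition, the following minimal Python sketch evaluates the difference quotient for a hypothetical base function with a single unit jump at $x = 1$ (so $\omega(x) = x$ for $x < 1$ and $\omega(x) = x + 1$ for $x > 1$) and for $F(x) = \int_0^x t^2 \, d\omega(t)$. Neither function comes from the paper; both are assumptions chosen so that the one-sided limits can be written in closed form. The quotient tends to $f(x) = x^2$ from both sides, including at the jump.

```python
JUMP_AT = 1.0  # hypothetical location of the single discontinuity of omega


def omega_left(x):
    """omega(x-): left-hand limit of the assumed base function."""
    return x + (1.0 if x > JUMP_AT else 0.0)


def omega_right(x):
    """omega(x+): right-hand limit of the assumed base function."""
    return x + (1.0 if x >= JUMP_AT else 0.0)


def F_left(x):
    """F(x-) for F(x) = integral of t^2 d(omega) from 0; the jump of F at 1 is f(1) * 1 = 1."""
    return x ** 3 / 3.0 + (1.0 if x > JUMP_AT else 0.0)


def F_right(x):
    """F(x+) for the same F."""
    return x ** 3 / 3.0 + (1.0 if x >= JUMP_AT else 0.0)


def psi(x, h):
    """Difference quotient of F with respect to omega at x.

    x + h is assumed to be a point of continuity of omega, so the
    left and right limits agree there and either may be used.
    """
    if h > 0:
        d_omega = omega_right(x + h) - omega_left(x)
        d_F = F_right(x + h) - F_left(x)
    else:
        d_omega = omega_right(x + h) - omega_right(x)
        d_F = F_right(x + h) - F_right(x)
    return d_F / d_omega if d_omega != 0 else 0.0


if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-6):
        print(h, psi(1.0, h), psi(1.0, -h), psi(0.5, h))
```

At the jump the denominator does not tend to zero, which is why the quotient has a limit there even though $F$ need not be defined at the point itself.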


Definition 2. Let $f(x)$ be defined on an $\omega$-measurable set $E$ on $[a, b]$. Let $e_a$ be the part of $E$ for which $f(x) < a$. If $e_a$ is $\omega$-measurable for every real number $a$ then $f$ is $\omega$-measurable on $E$.


Definition 3. Let $f(x)$ be $\omega$-measurable on an $\omega$-measurable set $E$. Let $(l_{i-1}, l_i)$ be a subdivision of the range of $f$ on $E$. Let $e_i$ be the points of $E$ for which $l_{i-1} \le f < l_i$. If $\sum l_{i-1}|e_i|_\omega$ tends to a limit as the supremum of $l_i - l_{i-1} \to 0$ then this limit is the Lebesgue-Stieltjes integral of $f(x)$ over the set $E$.
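
The sums of Definition 3 can be sketched numerically in the simplest case, a purely atomic $\omega$ that assigns finitely many point masses. The atoms, the integrand and the uniform levels $l_i = i \cdot \text{mesh}$ below are all hypothetical choices made for illustration. For such an $\omega$ the measure $|e_i|_\omega$ is the total mass of the atoms lying in $e_i$, so $\sum l_{i-1}|e_i|_\omega$ can be accumulated atom by atom.

```python
import math

# Hypothetical atomic base function: (location, mass) pairs.
ATOMS = [(0.0, 0.5), (1.0, 2.0), (2.5, 1.0)]


def f(x):
    """Hypothetical non-negative integrand."""
    return x * x


def lebesgue_stieltjes_sum(func, atoms, mesh):
    """Sum of l_{i-1} * |e_i|_omega for the levels l_i = i * mesh.

    Each atom x belongs to exactly one set e_i, the one with
    l_{i-1} <= func(x) < l_i, so summing l_{i-1} * mass over the atoms
    equals summing l_{i-1} * |e_i|_omega over the levels.
    """
    total = 0.0
    for x, mass in atoms:
        lower_level = math.floor(func(x) / mesh) * mesh
        total += lower_level * mass
    return total


if __name__ == "__main__":
    exact = sum(f(x) * mass for x, mass in ATOMS)
    for mesh in (0.5, 0.01, 0.001):
        print(mesh, lebesgue_stieltjes_sum(f, ATOMS, mesh), exact)
```

The sum underestimates the exact value by at most the mesh times the total mass, which is the sense in which it tends to a limit as the supremum of $l_i - l_{i-1}$ tends to zero.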


If $x$ is a point on $[a, b]$ and $f(x)$ is $\omega$-measurable on $[a, b]$ then

$$F(x) = \int_a^x f \, d\omega$$

is absolutely continuous relatively to $\omega$. Also $F(x)$ is in class U and $D_\omega F = f$ except for a set of $\omega$-measure zero. This definition of a Lebesgue-Stieltjes integral is the usual one and the stated properties may be proved in the usual way.


3. Properties of functions in class U.


Definition 4. Let $F(x)$ be a function in class U. The set $[x, F(x)]$ is the union of the graphs $[x, F(x+)]$ and $[x, F(x-)]$. For any interval of continuity $[x, F(x)]$ is thus the graph of $F(x)$ in the usual sense.


THEOREM 1. Let $F(x)$ and $G(x)$ be two functions in class U. If $F$ and $G$ are AC$-\omega$ and if $D_\omega F = D_\omega G$, except at a set of $\omega$-measure zero, then the sets $[x, F(x)]$, $[x, G(x)]$ are identical or one is a translation parallel to the y-axis of the other.


Set $H(x) = F(x) - G(x)$. Then $H$ is AC$-\omega$ and $D_\omega H = 0$ except for a set of $\omega$-measure zero. At a point of discontinuity $x_0$ of $\omega(x)$, $H(x_0-) = H(x_0+)$. For otherwise

$$\lim_{h \to 0,\ h < 0} \frac{H(x_0 + h) - H(x_0+)}{\omega(x_0 + h) - \omega(x_0+)} \neq 0$$

and it follows that $D_\omega H \neq 0$ on a set of $\omega$-measure greater than zero, which is a contradiction. If at points of discontinuity of $\omega$ we set $H(x) = H(x+) = H(x-)$ then $H(x)$ is continuous on $[a, b]$. We now prove that $H(x)$ is constant on $[a, b]$.

Let $E$ be the set on $[a, b]$ at which $D_\omega H = 0$. Then $|E|_\omega = |[a, b]|_\omega$. If $x$ is a point of $E$ and $\epsilon > 0$ is given there is a sequence of intervals $[x, x + h_i]$, $h_i > 0$, $h_i \to 0$ as $i \to \infty$ such that

$$(1) \qquad \left| \frac{H(x + h_i) - H(x-)}{\omega(x + h_i) - \omega(x-)} \right| < \epsilon.$$

By Lemma 2 there is a finite set $\Delta_i$ of these intervals associated with the set $E$ such that

$$(2) \qquad \Big| \sum|\Delta_i|_\omega - |[a, b]|_\omega \Big| < \epsilon, \qquad \sum|\bar\Delta_i|_\omega < \epsilon,$$

where $\bar\Delta_i$ is the finite set of intervals complementary to the set $\Delta_i$. It follows from the second member of (2) and the fact that $H(x)$ is AC$-\omega$ that if $\eta$ is given then $\epsilon$ can be so fixed that if $[x_i, x_i']$ are the intervals of $\bar\Delta_i$ then

$$\sum |H(x_i') - H(x_i)| < \eta.$$

It follows from (1) that for the intervals $\Delta_i$

$$\sum |H(x_i') - H(x_i)| < \epsilon M$$

where $M$ depends on the function of bounded variation $\omega(x)$. It then follows that $|H(b) - H(a)| < \epsilon M + \eta$. Because $\epsilon$ and $\eta$ are arbitrary, $\epsilon$ fixed after $\eta$, it follows that $H(b) = H(a)$. If $a < x < b$ it can be shown in the same way that $H(x) = H(a)$. Hence $H(x)$ is constant on $[a, b]$.
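
The estimate for $|H(b) - H(a)|$ can be spelled out as follows. This is only a sketch under stated assumptions: the intervals of $\Delta_i$ and $\bar\Delta_i$ are taken to subdivide $[a, b]$ together, $H$ has been made continuous so that $H(x-) = H(x)$, and $M$ is taken to be $\omega(b+) - \omega(a-)$, which is one admissible choice and not necessarily the author's.

```latex
\begin{align*}
|H(b) - H(a)|
  &\le \sum_{\Delta_i} |H(x_i') - H(x_i)|
     + \sum_{\bar\Delta_i} |H(x_i') - H(x_i)| \\
  &< \epsilon \sum_{\Delta_i} \{\omega(x_i') - \omega(x_i-)\} + \eta \\
  &\le \epsilon \,\{\omega(b+) - \omega(a-)\} + \eta
   = \epsilon M + \eta .
\end{align*}
```

The second line uses (1) on each interval of $\Delta_i$ and absolute continuity on $\bar\Delta_i$; the third uses that the intervals do not overlap and that $\omega$ is non-decreasing.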

It now follows that $F(x) - G(x) = C$, a constant, at points where both functions are continuous, that is, at the points of continuity of $\omega$. Furthermore, at points $x_0$ of discontinuity of $\omega$,

$$\lim_{x \to x_0 \pm 0} [F(x) - G(x)] = C,$$

$x$ a point of continuity of $\omega$, from which it follows that at points of discontinuity of $\omega$ the jumps, if any, of $F$ and $G$ are the same and in the same direction. It then follows that the sets $[x, F(x)]$, $[x, G(x)]$ are identical or one is a translation parallel to the y-axis of the other.


THEOREM 2. If $F(x)$ is in class U and is AC$-\omega$ and if $D_\omega F = f(x)$, except for a set of $\omega$-measure zero, then if $a$ and $x$ are points of continuity of $\omega$

$$F(x) - F(a) = \int_a^x f(x) \, d\omega.$$

Let

$$G(x) = \int_a^x f(x) \, d\omega.$$

Then $G(x)$ is in class U, is AC$-\omega$, and $D_\omega G = f(x)$ except for a set of $x$ of $\omega$-measure zero. Hence $G$ and $F$ satisfy the conditions of Theorem 1, and, except at points of discontinuity of $\omega$,

$$F(x) - G(x) = C.$$

Since $G(a) = 0$ it follows that $C = F(a)$ and

$$F(x) - F(a) = G(x) = \int_a^x f(x) \, d\omega.$$


This theorem gives point to the following definition: 









Definition 5. Let $f(x)$ be defined on $[a, b]$ and be measurable relative to $\omega$. Let $F(x)$ be a function in class U which is AC$-\omega$ and such that $D_\omega F = f(x)$ except possibly for a set of $\omega$-measure zero. If the series $\sum|f(x_i)\{\omega(x_i+) - \omega(x_i-)\}|$, where $x_i$, $i = 1, 2, \ldots$, are the points of discontinuity of $\omega$, converges and if $F(x_i+) - F(x_i-) = f(x_i)\{\omega(x_i+) - \omega(x_i-)\}$ then $F(x)$ is the Lebesgue-Stieltjes integral of $f(x)$:

$$F(x) - F(a) = \int_a^x f(x) \, d\omega(x).$$


Definition 6. A function $F(x)$ is generalized absolutely continuous with respect to $\omega$, ACG$-\omega$, over $[a, b]$ if this interval is the sum of a denumerable sequence of closed sets over each of which $F(x)$ is AC$-\omega$.


THEOREM 3. Let $F(x)$ and $G(x)$ be two functions in class U each of which is ACG$-\omega$ on $[a, b]$, and such that $D_\omega F = D_\omega G$ except for a set of $\omega$-measure zero. Then the sets $[x, F(x)]$, $[x, G(x)]$ are identical or one is a translation parallel to the y-axis of the other.


As in Theorem 1, let $H(x) = F(x) - G(x)$ at points of continuity of $\omega(x)$. Then $D_\omega H = 0$ except for a set of $\omega$-measure zero, and it follows as before that $H(x_0-) = H(x_0+)$ at points of discontinuity of $\omega$. Hence if $H(x_0)$ is equal to this common value then $H(x)$ is continuous on $[a, b]$. We now have $H(x)$ continuous and ACG$-\omega$ on $[a, b]$ and $D_\omega H = 0$ except for a set of $\omega$-measure zero. We show that $H(x)$ is constant on $[a, b]$.

Because $F$ and $G$ are both ACG$-\omega$ on $[a, b]$ it follows that

$$[a, b] = \sum E_i^1, \qquad [a, b] = \sum E_j^2, \qquad \text{where } E_i^1,\ E_j^2 \text{ are closed,}$$

where $F$ is AC$-\omega$ over each $E_i^1$ and $G$ is AC$-\omega$ over each $E_j^2$. Hence $H(x)$ is AC$-\omega$ over each of the sets $E_i^1 E_j^2$, $i, j = 1, 2, \ldots$ This is a denumerable sequence $\mathfrak{E}_1, \mathfrak{E}_2, \ldots$ of closed sets which cover $[a, b]$. Let $E$ be the set on $[a, b]$ which is such that if $x \in E$ then in every interval with $x$ as an interior point $H(x)$ fails to be constant. The set $E$ is closed. It then follows from Baire's theorem that there is an integer $n$ and an interval $[l, m]$ such that $E[l, m]$ is not empty, $E[l, m] = \mathfrak{E}_n[l, m] = e$. The set $e$ is closed and $H$ is AC$-\omega$ over $e$. Hence, if $\epsilon$ is given, there exists $\delta$ such that if $(x_i, x_i')$ is a set of non-overlapping intervals on $[l, m]$ with $x_i$, $x_i'$ points of $e$ and with $\sum|(x_i, x_i')|_\omega < \delta$ then $\sum|H(x_i') - H(x_i)| < \epsilon$. Now let $(x_i, x_i')$ be any set of non-overlapping intervals on $[l, m]$ with $\sum|(x_i, x_i')|_\omega < \delta$. If there are points of $e$ on $[x_i, x_i']$ let $\bar{x}_i$ be the first point of $e$ to the right of $x_i$, $\bar{x}_i = x_i$ if $x_i \in e$. Let $\bar{x}_i'$ be the first point of $e$ to the left of $x_i'$, $\bar{x}_i' = x_i'$ if $x_i' \in e$. Because $H(x)$ is continuous, and constant on intervals of $[l, m]$ which are complementary to the set $e = \mathfrak{E}_n[l, m]$, it follows that $H(x)$ is constant on $[x_i, \bar{x}_i]$ and on $[\bar{x}_i', x_i']$. We now have $H(\bar{x}_i) - H(x_i) = H(x_i') - H(\bar{x}_i') = 0$ and

$$\sum |H(x_i') - H(x_i)| \le \sum |H(\bar{x}_i) - H(x_i)| + \sum |H(\bar{x}_i') - H(\bar{x}_i)| + \sum |H(x_i') - H(\bar{x}_i')| \le \sum |H(\bar{x}_i') - H(\bar{x}_i)| < \epsilon.$$

The last relation follows because $\bar{x}_i$, $\bar{x}_i'$ are points of $\mathfrak{E}_n$ and $\sum|(\bar{x}_i, \bar{x}_i')|_\omega \le \sum|(x_i, x_i')|_\omega < \delta$. Then, because $(x_i, x_i')$ is any set of intervals on $[l, m]$ it follows that $H(x)$ is AC$-\omega$ on $[l, m]$. Because $D_\omega H = 0$ except for a set of $\omega$-measure zero it now follows from Theorem 1 that $H(x)$ is constant on $[l, m]$. Consequently, there are no points of $E$ on $[l, m]$. But this contradicts the fact that $E[l, m]$ is not empty. Hence the set $E$ is empty. It now follows that every point of $[a, b]$ is interior to an interval on which $H(x)$ is constant. By the Heine-Borel theorem there is a finite set of intervals covering $[a, b]$ on each of which $H(x)$ is constant. Then, because $H(x)$ is continuous it easily follows that $H(x)$ is constant on $[a, b]$. The proof of Theorem 3 may now be completed as in Theorem 1.


THEOREM 3. Let $F(x)$ be a function in class U which is ACG$-\omega$ on $[a, b]$. Let $f(x)$ be an $\omega$-measurable function on $(a, b)$ and let $D_\omega F = f$ except for at most a set of $\omega$-measure zero. Then $F(b+) - F(a-)$ can be determined in at most a denumerable set of operations.


LEMMA 2. Let $E$ be a closed set on $[a, b]$. If $F(x)$ satisfies the conditions of Theorem 3 there is an interval $[l, m]$ on $[a, b]$ such that $E[l, m]$ is not empty, such that $D_\omega F$ is Lebesgue-Stieltjes integrable with respect to $\omega$ over $E[l, m]$, and such that if $[\alpha_i, \beta_i]$ are the intervals on $[l, m]$ contiguous to $E[l, m]$, then $\sum|F(\beta_i+) - F(\alpha_i-)|$ converges.


Let $[a, b] = \mathfrak{E}_1 + \mathfrak{E}_2 + \ldots$ where each $\mathfrak{E}_i$ is closed and $F$ is AC$-\omega$ over $\mathfrak{E}_i$. There is then an integer $n$ and an interval $[l, m]$ such that $E[l, m] = \mathfrak{E}_n[l, m] = e$. We now turn our attention to the function $F(x)$ on $[l, m]$. $F$ is AC$-\omega$ over $e$. It then easily follows that there exists a positive number $M$ such that for any set of non-overlapping intervals $(x_i, x_i')$ on $[a, b]$ with $x_i$, $x_i'$ points of $e$ the relation $\sum|F(x_i'+) - F(x_i-)| < M$ holds. We use this fact in showing that $|D_\omega F| = |f|$ is Lebesgue-Stieltjes integrable with respect to $\omega$ over $e$. Let $(l_{i-1}, l_i)$, $i = 1, 2, \ldots$, be a subdivision of the range $[0, \infty]$.
Let e, = E(lnn <|fl| <hl,x€ &),i>1, e = Ell) < f < h). Suppose that 
Z1,-1/e;6 diverges. Fix m so that 


} ¥ Lyslesle > 2M. 


i=1 
If x € e, there is a sequence of intervals (x, x + h,), hy > 0, 4 0 such that 


Fe + hy) — Fle—)| 


(3) la(x + Ay) — a(x—)| 















Let €o, €1, €2, . . . be a sequence of positive numbers with «, — 0. By Lemma 2 
there is a finite set A,° of non-overlapping intervals of the set associated with 
the set ¢) by means of (3) for which 


(4) } Arle > leole = @&, > SoA, le < €. 


Because of the second relation of (4) there exists a finite set A,! of the intervals 
associated with e; by means of (3) such that 


> [Ac |. > lerle — €0 — €1, os ja(A, + Ar’) le < €o + 41, 


and 2A,' does not overlap 2A,°. If this process is continued there is obtained 


a set of intervals A,*, which do not overlap the set [A,°+ TAA +...+ 2A, 
such that 

i [Al > lezle =e OG 220 * & 
and 


> mae + A +...+ AL <atat...+¢ 


Also, because of (3), it is true that if (x,, x,/) are the intervals of the set A;* 
then 


F(xi+) — F(i-) 





> hes 

a(x +) — a(x,—) ae 

and 
> {F(xi+) — F(xi-—)} > bea(lezle — 0 — er — ... — &). 
Combining all the sets A;* into a single set A, = (x,, x,;/) and summing over 
this set we get 
> |F(x’+) — F(x;-—)| > _ Ie-1\€x\o — nloeg — (nm — 1)hyey — .. . — Ips. 
k=1 

The first sum on the right is greater than $2M$. The numbers $\epsilon_0, \epsilon_1, \ldots, \epsilon_n$ are independent of $n$, and independent of the numbers $l_0, l_1, \ldots, l_{n-1}$. Hence the left side is not less than $2M$. But $x_i$, $x_i'$ are points of $e$ which compels the left side to be less than $M$. Thus there is a contradiction and we may conclude that $\sum l_{i-1}|e_i|_\omega$ converges. It then follows that $|D_\omega F|$ is Lebesgue-Stieltjes integrable with respect to $\omega$ over $e$.

Let $(\alpha_i, \beta_i)$ be the intervals on $[l, m]$ complementary to $e$. Let $\epsilon > 0$ be given. Fix $n_0$ with

$$\sum_{i=n_0+1}^{\infty} |F(\beta_i+) - F(\alpha_i-)| < \epsilon.$$

Take $(l_{i-1}, l_i)$ a subdivision of $(-\infty, \infty)$. Let $e_i$, $e_i'$ be respectively the parts of $e$ for which $l_{i-1} \le D_\omega F < l_i$, $l_{i-1} < D_\omega F \le l_i$. Let the subdivision $(l_{i-1}, l_i)$ and the integer $n_1$ be such that

$$\sum_{i=-n_1}^{n_1} l_{i-1}|e_i|_\omega, \qquad \sum_{i=-n_1}^{n_1} l_i|e_i'|_\omega$$

differ from the Lebesgue-Stieltjes integral of $D_\omega F$ over $e$ by not more than $\epsilon$. By working as before with the sets $e_i$ we can get a set of intervals $\Delta_i$ with end points $x_i$, $x_i'$ belonging to $e$ and such that the intervals $\Delta_i$ do not overlap the intervals $(\alpha_i, \beta_i)$, $i = 1, 2, \ldots, n_0$, such that


y F(x,'+) = F(x,—) > > Leslesle —e€e> fra, —_ 2e, 
tol = e 


and such that the finite set of intervals A, complementary to the set A, and 
the set (a,, 8;),i = 1,2,..., mo, satisfy 2|A,|, < 6, from which it follows that 
=| F(x,/) — F(x,;)| < «. Hence 

no 


F(m+) — F(l—) = » {F(B:) — Fla} + DY {F(xi’+) — F(xi—)} 
+ > {F(x/) — F(x;)} 


> > {F(8,) — Fla;)} + {Dra — 3e. 


1 


If a similar procedure is used with the sets e,/ it may be shown that 


oo 


F(m+) — Fl-) < 5 (FB) — Fla—)} + f DuFa, + 36 
1 e 
Because ¢ is arbitrary we conclude that 
F(m+) — F(-) = f D.Fdo + X (F(6-) — Flact)}. 


We can now state that if E is any closed set on [a, 5] there is an interval 
[l,m] containing points of E such that D,F is summable over E[{/, m] and 
such that if (a,,8,) are the intervals on [l,m] complementary to the set 


E{l, m] then 
F(m+) — F—-) = J. PoP + ¥ (F(6-+) — Fla—)}. 


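Let $E_1$ be the points of non-summability of $D_\omega F$ over $[a, b]$, $(\alpha_i, \beta_i)$ the intervals complementary to $E_1$. If $(\alpha', \beta')$ is an interval such that $\alpha_i < \alpha' < \beta' < \beta_i$ then $D_\omega F$ is Lebesgue-Stieltjes integrable over $[\alpha', \beta']$ and

$$F(\beta'+) - F(\alpha'-) = \int_{[\alpha', \beta']} D_\omega F \, d\omega.$$

Because of the continuity properties of $F$ it follows that as $\alpha' \to \alpha_i$, $\beta' \to \beta_i$,

$$F(\beta'+) - F(\alpha'-) \to F(\beta_i-) - F(\alpha_i+),$$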


and 


F(Bi+) — Flai—) = F(Bi—) — Flat) 


+ f D.Fdw — f D.Fdw. 
By ay 















Thus $F(\beta_i+) - F(\alpha_i-)$ is determined for all intervals $(\alpha_i, \beta_i)$ contiguous to the set $E_1$. Now let $E_2$ be the points of $E_1$ which are such that if $x \in E_2$ there is no interval $[l, m]$ containing $x$ with $D_\omega F$ summable over $E_1[l, m]$ and $\sum\{F(\beta_i+) - F(\alpha_i-)\}$ converging, where $(\alpha_i, \beta_i)$ are the intervals on $[l, m]$ contiguous to the set $E_1[l, m]$. The set $E_2$ is closed and, by Lemma 2, non-dense on $E_1$. If $(\alpha_i, \beta_i)$ are the intervals complementary to $E_2$ the procedure used for the intervals complementary to $E_1$ can now be used to obtain $F(\beta_i+) - F(\alpha_i-)$ for these intervals $(\alpha_i, \beta_i)$ complementary to $E_2$. This process can be continued by transfinite induction to arrive at $F(b+) - F(a-)$ in a denumerable number of steps.
A consideration of Theorem 3 leads to the following definition. 


Definition 7. Let $\omega$ be a non-decreasing function on $[a, b]$ and let $f(x)$ be defined on $[a, b]$ and be measurable relatively to $\omega$. If there exists a function $F(x)$ in class U which is ACG$-\omega$ on $[a, b]$ and is such that $D_\omega F = f$ except for a set of $\omega$-measure zero, and for which the relations of Definition 5 are satisfied at the points of discontinuity of $\omega$, then $F(x)$ is an indefinite Lebesgue-Stieltjes integral of $f$ with respect to $\omega$.

The descriptive definition of an integral with respect to a non-decreasing function $\omega$ given here appears to be equivalent to the constructive definition given in (3, p. 666). This requires proof. Another problem for investigation is that of extending the methods of the present paper to the case in which the base function $\omega$ is VBG.


REFERENCES 


1. C. Choquet, Applications des propriétés descriptives de la fonction contingent à la théorie
de variable réelle et à géométrie différentielle des variétés Cartésiennes, J. Math. pures
et appl., 26 (1947), 115-226.

2. C. A. Hayes, Jr. and C. Y. Pauc, Full individual and class differentiation theorems in their
relations to halo and Vitali properties, Can. J. Math., 7 (1955), 221-274.

3. R. L. Jeffery, Non-absolutely convergent integrals with respect to functions of bounded variation,
Trans. Amer. Math. Soc., 34 (1932), 645-675.

4. H. Lebesgue, Leçons sur l'intégration (Paris, 1928).

5. J. Radon, Theorie und Anwendungen der absolut additiven Mengenfunktionen, Wiener
Sitzungsberichte, 122 (Abt. IIA) (1913), 1295-1438.

6. A. J. Ward, The Perron-Stieltjes integral, Math. Zeit., 41 (1936), 578-604.


Queen's University, 
Kingston 














ON THE DENJOY CONJECTURE 
JAMES A. JENKINS 


1. In recent years many of the properties of regular functions have been 
shown to extend to quasiconformal mappings. (The latter term is here under- 
stood in the sense defined in (5).) This is particularly true of those results 
which can be proved by use of the method of the extremal metric. It is rather 
strange, then, that a result which constitutes one of the first notable applica- 
tions of this method has not been so extended (at least to the author's know- 
ledge). Here we refer to the proof of the Denjoy conjecture given by Ahlfors 
(1). The reason for this may be that Ahlfors’ proof uses in an essential manner 
the principle of majoration for harmonic functions and this part of the proof 
does not extend readily to quasiconformal mappings. Not long afterwards 
another proof was published by Beurling (2). The same situation appears to 
hold for this proof although in some ways it is probably closest in spirit to 
the one given below. Later proofs are either rather technical modifications of 
Ahlfors’ proof or follow that of Carleman (3) and so contain features rendering 
them unsuitable for generalization to quasiconformal mappings. 

In this paper we give the extension of the Denjoy conjecture to quasi- 
conformal mappings. Our proof uses the principle of harmonic majoration 
only in proving one self-contained lemma and otherwise operates exclusively 
with the method of the extremal metric. 


2. A principal role will be played by a geometrical configuration which we will call a triad. This consists of a simply-connected domain of hyperbolic type, say $D$, an open boundary arc of $D$ (in the natural sense of boundary correspondence), say $\gamma$, and a distinguished interior point $P$ of $D$. We denote in the usual way by $\omega(P, \gamma, D)$ the harmonic measure of $\gamma$ taken at $P$ with respect to the domain $D$. Also we will use the notation $\gamma^*$ for the closed boundary arc of $D$ complementary to $\gamma$. We denote by $m(P, \gamma, D)$ the module (4, 6) of the class of (locally rectifiable) open arcs lying in $D - \{P\}$, running from $\gamma$ back to $\gamma$ and separating $P$ from $\gamma^*$. It is well known that there is a monotone increasing function from the interval $[0, 1)$ to the half-infinite interval $[0, \infty)$ which relates $\omega(P, \gamma, D)$ to $m(P, \gamma, D)$. For real numbers $a$ and $b$, $a < b$, we denote by $S(a, b)$ the strip in the $(u, v)$-plane defined by

$$a < u < b,$$

by $g(a)$ its boundary arc $u = a$ and by $g(b)$ its boundary arc $u = b$.


Received October 9, 1957. Research supported in part by the Mathematics Section, National 
Science Foundation through the University of Notre Dame and partially supported by 
the Office of Ordnance Research, U. S. Army under contract No. DA-36-034-OR D-2453 




LEMMA 1. Consider a triad consisting of a Jordan domain $D$ in the $z$-plane, an open boundary arc $\gamma$ of $D$ and an interior point $P$ of $D$. Let $f$ be a mapping quasiconformal on $D$, of maximal dilation (5) $K$, which satisfies

$$|f(z)| \le A, \quad z \in \gamma,$$

$$|f(z)| \le B, \quad z \in \gamma^*,$$

with $0 < A < B$. Then

$$(1) \qquad m(P, \gamma, D) \le K\, m(\log|f(P)|,\ g(\log A),\ S(\log A, \log B))$$

where the logarithms in each case denote the principal value.


There exists a triad consisting of a Jordan domain $D'$ in the $z'$-plane, a boundary arc $\gamma'$ of $D'$ and an interior point $P'$ of $D'$ related to the given triad by a univalent mapping $z = \phi(z')$, quasiconformal on $D'$ and of maximal dilation $K$ such that the function

$$g(z') = f(\phi(z'))$$

is regular on $D'$. Then we have by the two constants theorem

$$\log|g(P')| \le (\log A)\, \omega(P', \gamma', D') + (\log B)\,(1 - \omega(P', \gamma', D')),$$

that is

$$\omega(P', \gamma', D') \le \frac{\log B - \log|g(P')|}{\log B - \log A} \le \omega(\log|f(P)|,\ g(\log A),\ S(\log A, \log B)).$$

In view of the monotonic correspondence between the harmonic measures and the modules of triads we have

$$m(P', \gamma', D') \le m(\log|f(P)|,\ g(\log A),\ S(\log A, \log B)).$$

On the other hand we have evidently

$$m(P, \gamma, D) \le K\, m(P', \gamma', D')$$

from which inequality (1) follows at once.
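
The passage from the two constants theorem to the bound on the harmonic measure is a one-line rearrangement, written out below as a sketch. The abbreviation $\omega'$ for $\omega(P', \gamma', D')$ and the symbol $u_0$ for $\log|f(P)|$ are introduced here for illustration only, and the identification of the strip's harmonic measure assumes the usual uniqueness of bounded harmonic functions with given boundary values.

```latex
\begin{align*}
\log|g(P')| &\le \omega' \log A + (1 - \omega') \log B
             = \log B - \omega' (\log B - \log A), \\
\omega' (\log B - \log A) &\le \log B - \log|g(P')|, \\
\omega' &\le \frac{\log B - \log|g(P')|}{\log B - \log A}
  \qquad (\log B - \log A > 0).
\end{align*}
% In the strip S(\log A, \log B) the function
%   h(u, v) = (\log B - u) / (\log B - \log A)
% is harmonic and bounded, equals 1 on u = \log A and 0 on u = \log B,
% so at the real point u_0 = \log|f(P)| = \log|g(P')| it gives
%   \omega(u_0, g(\log A), S(\log A, \log B))
%     = (\log B - u_0) / (\log B - \log A).
```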

While we have used harmonic majoration in the proof of this result its
statement involves only modules. It would be very interesting to have a
proof of Lemma 1 purely by the method of the extremal metric.


LEMMA 2. If $l < \lambda < L$ are real numbers then for $l$, $\lambda$ fixed and $L$ tending to infinity

$$m(\lambda, g(l), S(l, L)) \le \frac{1}{\pi} \log L + O(1).$$


Let $\Gamma$ be the class of (locally rectifiable) open arcs lying in $S(l, L) - \{\lambda\}$, running from $g(l)$ back to $g(l)$ and separating $\lambda$ from $g(L)$. To provide an admissible metric in $S(l, L)$ for the module problem for $\Gamma$ we define first the square $\Sigma(l, L)$

$$l < u < L, \qquad -\tfrac{1}{2}(L - l) < v < \tfrac{1}{2}(L - l).$$


Let $E(l, \lambda, L)$ be the half of the open elliptical disc to the right of the line $u = l$ which is bounded by the ellipse with centre at $l$, foci at $\lambda$, $2l - \lambda$ and passing through the point $(L, \tfrac{1}{2}(L - l))$. For the semi-axis major $a$ of this ellipse we have

$$(a^2 - (\lambda - l)^2)(L - l)^2 + \tfrac{1}{4}(L - l)^2 a^2 = a^4 - a^2(\lambda - l)^2$$
so that for LZ large enough 
a< SL. 
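
The growth of the semi-axis can be checked numerically. The sketch below assumes the relation for $a$ that the stated geometry gives, namely $(a^2 - (\lambda - l)^2)(L - l)^2 + \tfrac14 (L - l)^2 a^2 = a^4 - a^2(\lambda - l)^2$, which is a quadratic in $a^2$, and it assumes that the module of the elliptical family is $(1/\pi)\log P$ with $P$ the larger root of $P + P^{-1} = 2a/(\lambda - l)$. The function names are hypothetical. Under these assumptions $a/(L - l)$ tends to $\sqrt5/2$, so $a$ is bounded by a fixed multiple of $L$ and the module differs from $(1/\pi)\log L$ by a bounded amount.

```python
import math


def semi_major_axis(l, lam, L):
    """Larger root a of a^4 - a^2 (c^2 + 5 d^2 / 4) + c^2 d^2 = 0.

    Here c = lam - l is the focal distance and d = L - l; this is the
    condition that the ellipse with foci l +/- c passes through (L, d / 2).
    """
    c2 = (lam - l) ** 2
    d2 = (L - l) ** 2
    s = c2 + 1.25 * d2
    disc = s * s - 4.0 * c2 * d2
    return math.sqrt((s + math.sqrt(disc)) / 2.0)


def elliptical_module(l, lam, L):
    """(1 / pi) log P, with P the larger root of P + 1/P = 2 a / (lam - l)."""
    t = 2.0 * semi_major_axis(l, lam, L) / (lam - l)
    P = (t + math.sqrt(t * t - 4.0)) / 2.0
    return math.log(P) / math.pi


if __name__ == "__main__":
    for L in (1e1, 1e3, 1e6):
        a = semi_major_axis(0.0, 1.0, L)
        gap = elliptical_module(0.0, 1.0, L) - math.log(L) / math.pi
        print(L, a / L, gap)
```

The printed gap settles near $(1/\pi)\log\sqrt5$ for $l = 0$, $\lambda = 1$, which is the $O(1)$ term in this particular instance.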


Let now $\rho^*(w)|dw|$ be the extremal metric in the module problem for the family of curves $Z$ lying in $E(l, \lambda, L)$, running from the minor axis back to the minor axis and separating $\lambda$ from the elliptical boundary arc. It is well known that this family has module

$$m(Z) = \frac{1}{\pi} \log P$$

where $P$ is the larger root of

$$P + P^{-1} = 2(\lambda - l)^{-1} a.$$

Evidently

$$m(Z) \le \frac{1}{\pi} \log\,[2(\lambda - l)^{-1} a] \le \frac{1}{\pi} \log L + O(1).$$


Now we regard the metric $\rho(w)|dw|$ defined by

$$\rho(w) = 0, \quad w \in S(l, L) - \Sigma(l, L),$$
$$\rho(w) = \max(\rho^*(w), (L - l)^{-1}), \quad w \in \Sigma(l, L).$$

Considering $\Gamma$ as composed of the three classes of curves which meet both, just one, or neither of the horizontal sides of $\Sigma(l, L)$ and recalling the symmetry of the metric $\rho(w)|dw|$ in the real axis we see that this metric is admissible in the module problem for $\Gamma$. In it the area of $S(l, L)$ is less than $m(Z) + 1$. Thus

$$m(\lambda, g(l), S(l, L)) \le \frac{1}{\pi} \log L + O(1)$$

as stated.


3. The concepts of an integral quasiconformal mapping and its maximum 
modulus are evident generalizations of similar ones for regular functions. So 
are the concept of an asymptotic value of such a function and the manner 
of counting distinct asymptotic values. We are now ready to state our principal 
result. 












THEOREM. If the integral quasiconformal mapping $f(z)$ of maximal dilation $K$ has $n$ distinct asymptotic values and $M(r)$ denotes the maximum modulus of $f(z)$ for $|z| = r$, then

$$\liminf_{r \to \infty} \log M(r)\, r^{-n/2K} > 0.$$


The standard reduction of the problem enables us to consider the following situation. There exist a circle $|z| = R_0$ ($R_0 > 0$) and $n$ non-intersecting open arcs $A_j$, $j = 1, \ldots, n$, running in $|z| > R_0$ from this circle to the point at infinity and dividing $|z| > R_0$ into $n$ domains $D_j$, $j = 1, \ldots, n$. Further, there is a positive constant $\alpha$ such that on $|z| = R_0$ and on the $A_j$, $j = 1, \ldots, n$, we have

$$|f(z)| \le \alpha.$$

Finally, in each domain $D_j$, $j = 1, \ldots, n$, there is a point $P_j$, $j = 1, \ldots, n$, of modulus $\beta_j$ for which

$$|f(P_j)| > \alpha.$$

Let

$$\beta = \max_{j = 1, \ldots, n} \beta_j.$$

Then for $R > \beta$ there is on the circle $|z| = R$ a cross cut $B_j(R)$ in $D_j$ separating the boundary arc of $D_j$ on $|z| = R_0$ and the point $P_j$ from the boundary point of $D_j$ at infinity. Let the component of $D_j - B_j(R)$ containing $P_j$ be denoted by $D_j(R)$. Let the open boundary arc of $D_j(R)$ complementary to the closure of $B_j(R)$ be denoted by $C_j(R)$. Let $\rho_j(z, R)|dz|$ be the extremal metric in the module problem defining $m(P_j, C_j(R), D_j(R))$, $j = 1, \ldots, n$.

We denote by $K(\beta, R)$ the circular ring

$$\beta < |z| < R.$$

In $K(\beta, R)$ we define the metric $\rho(z, R)|dz|$ by

$$\rho(z, R) = \rho_j(z, R), \quad z \in K(\beta, R) \cap D_j(R),\ j = 1, \ldots, n,$$
$$\rho(z, R) = 0, \quad \text{elsewhere in } K(\beta, R).$$

Then if the rectifiable Jordan curve $c$ separates the boundary components of $K(\beta, R)$ we see at once

$$\int_c \rho(z, R)\,|dz| \ge n$$

so that

$$(2) \qquad \frac{n^2}{2\pi} \log \frac{R}{\beta} \le \sum_{j=1}^{n} m(P_j, C_j(R), D_j(R)).$$
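
The step from the length estimate to (2) can be sketched as follows. The sketch assumes the normalization in which the module of a curve family is the infimum of the area $\iint \rho^2$ over metrics giving every curve of the family length at least 1, and it uses the standard value $\frac{1}{2\pi}\log(R/\beta)$ for the module of the curves separating the boundary components of the ring; whether this is exactly the convention of (4, 6) is an assumption.

```latex
% rho / n is admissible for the curves separating the boundary
% components of K(\beta, R), since \int_c \rho \,|dz| \ge n.
\begin{align*}
\frac{1}{2\pi} \log \frac{R}{\beta}
  &\le \iint_{K(\beta, R)} \Bigl(\frac{\rho(z, R)}{n}\Bigr)^{2} \, dA \\
  &= \frac{1}{n^{2}} \sum_{j=1}^{n}
       \iint_{K(\beta, R) \cap D_j(R)} \rho_j(z, R)^{2} \, dA \\
  &\le \frac{1}{n^{2}} \sum_{j=1}^{n} m(P_j, C_j(R), D_j(R)),
\end{align*}
% and multiplying through by n^2 gives (2).
```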
By Lemma 1, since

$$|f(z)| \le \alpha \text{ on } C_j(R),$$
$$|f(z)| \le M(R) \text{ on } B_j(R),$$

we have

$$(3) \qquad \sum_{j=1}^{n} m(P_j, C_j(R), D_j(R)) \le K \sum_{j=1}^{n} m(\log|f(P_j)|,\ g(\log \alpha),\ S(\log \alpha, \log M(R))).$$

Moreover, by Lemma 2,

$$(4) \qquad \sum_{j=1}^{n} m(\log|f(P_j)|,\ g(\log \alpha),\ S(\log \alpha, \log M(R))) \le \frac{n}{\pi} \log \log M(R) + O(1).$$

Combining (2), (3), and (4) we find

$$\frac{n^2}{2\pi} \log R + O(1) \le K \frac{n}{\pi} \log \log M(R)$$

so that

$$\log M(R) \ge S R^{n/2K}$$

for a suitable positive constant $S$ and $R$ large enough. This proves our theorem.
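
For completeness, the last step can be written out. In the sketch below $C$ denotes a constant absorbing the $O(1)$ terms and the fixed quantity $\log\beta$; the name $C$ and the identification $S = e^{-C}$ are introduced here for illustration.

```latex
\begin{align*}
\frac{n^{2}}{2\pi} \log R + O(1)
  &\le K \, \frac{n}{\pi} \, \log\log M(R), \\
\log\log M(R)
  &\ge \frac{n}{2K} \log R - C, \\
\log M(R)
  &\ge e^{-C} R^{\,n/2K} = S\, R^{\,n/2K}.
\end{align*}
```

The second line divides by $Kn/\pi$, and the third exponentiates once.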

Our result can be restated in a perhaps more familiar but weaker form
in saying that an integral quasiconformal mapping of order $\rho$ (the extension
of this notion from integral functions is also immediate) which has maximal
dilation $K$ has at most $[2K\rho]$ asymptotic values. An evident modification
of the corresponding example for integral functions shows this result to be
best possible.


REFERENCES 


1. Lars V. Ahlfors, Untersuchungen zur Theorie der konformen Abbildung und der ganzen
Funktionen, Acta Societatis Scientiarum Fennicae, Nova Series A, Opera Phys.-math.,
1, (1930), no. 9, 1-40.

2. A. Beurling, Études sur un problème de majoration, thesis (Upsala, 1933).

3. T. Carleman, Sur une inégalité différentielle dans la théorie des fonctions analytiques, C. R.
Acad. Sci. Paris, 196 (1933), 995-997.

4. James A. Jenkins, Symmetrization results for some conformal invariants, Amer. J. Math.,
75 (1953), 510-522.

5. James A. Jenkins, On quasiconformal mappings, J. Rat. Mech. and Anal., 5 (1956), 343-352.

6. James A. Jenkins, Univalent functions and conformal mapping, No. 18, Ergebnisse der Mathematik,
Springer-Verlag (to appear).








University of Notre Dame 
and 
Institute of Advanced Study 

















ON BABINET’S PRINCIPLE 
R. C. MacCAMY 


1. Introduction. In acoustic and electromagnetic theory much use is 
made of what is called the Principle of Babinet (1). The principle states that 
the problems of diffraction by an aperture, S, in a plane screen and by a plane 
obstacle occupying the position S, are equivalent. It is the intent of this paper 
to indicate how this notion of equivalent boundary problems, for partial 
differential equations, can be extended to other situations. One may alter the 
underlying differential equation or the boundary conditions but the word 
plane is essential, that is, data must always be given on a plane. 

It was observed by Rubin (4) that a kind of Babinet Principle holds in 
the diffraction of water waves by a “‘dock.’’ The principle we propose is 
somewhat different in the water wave problem and of wider applicability. Its 
emphasis is on the integral equation formulation of boundary problems and 
its principal aim is to simplify, as much as possible, these equations. 

In § 2 we shall prove the classical Babinet principle in slightly different 
form in order to introduce the method. In §§ 3 and 4, we state and prove 
the extended principle and finally in § 5, we give two simple examples. 


2. The Classical Babinet Principle. In order to have a point of de- 
parture, we state and prove the classical principle in a slightly different 
form for two dimensions. 


THEOREM 1. Consider the following two problems: $u(x, y)$, $v(x, y)$ solutions of

$$(2.1) \qquad w_{xx} + w_{yy} + w = 0, \quad y < 0,$$

continuous in $y \le 0$, satisfying a radiation condition

$$\lim_{r \to \infty} r^{1/2} \left( \frac{\partial w}{\partial r} - i w \right) = 0$$

uniformly in the polar angle, $r^2 = x^2 + y^2$, and with

(I) $u(x, 0) = 0$ on $|x| > a$, $u_y(x, 0) = g(x)$ on $|x| < a$,
(II) $v_y(x, 0) = 0$ on $|x| > a$, $v(x, 0) = h(x)$ on $|x| < a$;

$g(x)$ and $h(x)$ given functions which we assume analytic on $|x| < a$. Then the solution of problem (II) yields simultaneously the solution of problem (I).


Received September 19, 1957. This work was sponsored in part by the Office of Naval
Research contract Nonr-222(25), University of California at Berkeley and in part by Air
Force Office of Scientific Research contract AF-18(600)-1138, Carnegie Institute of Technology.



Assuming we can solve (II), let $v^0$, $v^1$, $v^2$ be solutions for $h(x) = h_0(x)$, $\cos x$ and $\sin x$ respectively, $h_0(x)$ any (fixed) particular solution of the differential equation,

$$(2.2) \qquad \frac{d^2 h}{dx^2} + h = -g(x), \quad -a < x < +a.$$

Now set

$$u(x, y) = v^0_y(x, y) + A v^1_y(x, y) + B v^2_y(x, y),$$

$A$ and $B$ any constants. Then clearly $u(x, 0) = 0$ on $|x| > a$. Now $g(x)$ and hence $h_0(x)$ are analytic on $|x| < a$ and thus the $v^i(x, y)$ can be continued analytically across $y = 0$, $|x| < a$ as solutions of (2.1). Thus we can write for $y = 0$, $|x| < a$,

$$v^i_{yy}(x, 0) = -v^i_{xx}(x, 0) - v^i(x, 0)$$

and hence by (2.2)

$$u_y(x, 0) = -\left( \frac{d^2}{dx^2} + 1 \right) (h_0(x) + A \cos x + B \sin x) = g(x).$$
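
The identity for $u_y(x, 0)$ depends only on (2.2) and on the fact that $\cos x$ and $\sin x$ are annihilated by $d^2/dx^2 + 1$, so it can be checked numerically on boundary data alone. The sketch below is an illustration under assumed data, not part of the paper: it takes the hypothetical choice $g(x) = x^2$, for which $h_0(x) = 2 - x^2$ is a particular solution of (2.2), and approximates the second derivative by a central difference.

```python
import math


def g(x):
    """Hypothetical Neumann data for problem (I)."""
    return x * x


def h0(x):
    """A particular solution of h'' + h = -g for g(x) = x^2."""
    return 2.0 - x * x


def boundary_data(x, A, B):
    """Dirichlet data h0 + A cos x + B sin x carried by v0 + A v1 + B v2."""
    return h0(x) + A * math.cos(x) + B * math.sin(x)


def u_y_on_strip(x, A, B, step=1e-3):
    """-(d^2/dx^2 + 1) applied to the boundary data, by central differences."""
    second = (
        boundary_data(x + step, A, B)
        - 2.0 * boundary_data(x, A, B)
        + boundary_data(x - step, A, B)
    ) / step ** 2
    return -(second + boundary_data(x, A, B))


if __name__ == "__main__":
    for x in (-0.8, 0.0, 0.5):
        print(x, u_y_on_strip(x, 0.7, -1.3), g(x))
```

The agreement holds for every choice of $A$ and $B$, which is why these two constants remain free to be fixed by the continuity requirement at $(\pm a, 0)$.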


There remains the question of determining $A$ and $B$. It is well known that solutions of (II), continuous at $(\pm a, 0)$ have the form,

$$v(x, y) = c^{\pm} [(x \mp a)^2 + y^2]^{1/4} + \ldots$$

The dots indicate terms with continuous first derivatives at $(\pm a, 0)$. Since $u$ derives from $v$ by differentiation, we have accordingly,

$$u(x, y) = (C_0^{\pm} + C_1^{\pm} A + C_2^{\pm} B)[(x \mp a)^2 + y^2]^{-1/4} + \ldots$$

near $x = \pm a$. But $u(x, y)$ is to be continuous in $y \le 0$, hence we make

$$C_0^{\pm} + C_1^{\pm} A + C_2^{\pm} B = 0.$$

These equations determine $A$ and $B$ unless the homogeneous system, $C_1^{\pm} A + C_2^{\pm} B = 0$ should have a non-zero solution $A_0$, $B_0$. But if the latter were so, then clearly,

$$u_0(x, y) = A_0 v^1_y(x, y) + B_0 v^2_y(x, y)$$

is a solution of (I) continuous in $y \le 0$. Moreover, it is a non-trivial solution since it is evident that $v^1$ and $v^2$ are even and odd functions of $x$ respectively, and are non-zero. For a proof one uses the uniqueness of the solution of (II). The solution of (I) is known to be unique and thus we would have $u_0(x, y) \equiv 0$ which in turn implies $A_0 = B_0 = 0$.

Problem (II) is easily reduced to an integral equation. For if we set

$$(2.3) \qquad v(x, y) = \int_{-a}^{+a} H_0\big([(x - t)^2 + y^2]^{1/2}\big) f(t) \, dt$$

where $H_0$ is the Hankel function of first kind, we have a solution of (II) provided only that $f$ satisfies,

$$(2.4) \qquad \int_{-a}^{+a} H_0(|x - t|) f(t) \, dt = h(x) \quad \text{on } |x| < a.$$


On the basis of Theorem 1, we see that in some fashion problem (I) can be 
reduced to the same equation. However, the technique for doing so without 
using Theorem 1, requires some rather involved computation and we are led 
to the conclusion that (II) is in some sense ‘‘simpler’’ than (I). 

Various generalizations of the above theorem suggest themselves. Clearly 
we could apply the same technique with any elliptic differential equation in 
two variables with constant coefficients. Moreover, the single line segment 
|x| < a, y = 0 could be replaced by a series of such segments along the x-axis. 
The particular extension we wish to discuss concerns a change in the boundary 
conditions (I) and (II). In particular we aim always to replace one problem 
by another in which the first of conditions (II) holds, that is, $v_y(x, 0) = 0$
on |x| > a. In this manner we always reduce our secondary problem to an 
integral equation as in (2.3) and (2.4). To minimize notation we restrict 
ourselves (except in § 5) to Laplace’s equation and to the single strip y = 0, 
|x| <a. 


3. Statement. Let H denote the class of functions $u(x, y)$, continuous in $y \le 0$ and harmonic in $y \le 0$ for $(x, y) \neq (\pm a, 0)$. We write


i teh y= f 
ea “4, ho D f= fd. 


We call the following problem (P 1): Find the function u(x, y) € H such that 
(I) Iu=0 ony=0 |x|>a; Mu=g(x) ony=0 |x| <a. 


Here $g(x)$ is a given function analytic on $|x| < a$, $L$ is a linear differential
operator with constant coefficients,

\[ L = L(D, Y) = \sum_{m=0}^{a} a_m D^m Y + \sum_{m=0}^{b} b_m D^m, \qquad a_m,\ b_m \text{ constants}, \tag{3.1} \]

and $M$ has the more general form,

\[ M = M(D, Y) = \sum_{m=-\rho}^{r} r_m D^m Y + \sum_{m=-\sigma}^{s} s_m D^m, \qquad r_m,\ s_m \text{ constants}. \tag{3.2} \]

Note that the form (3.1) is the most general differential operator, with constant
coefficients, for functions in H, since higher order derivatives with respect
to $y$ can always be replaced by derivatives with respect to $x$.

If there exists a growth condition as $x^2 + y^2 \to \infty$ which, adjoined to
problem (P I), guarantees that the solution is unique, we say (P I) is unique.
We leave aside the difficult question of whether such conditions always exist.
We say (P II) solves (P I) if the solution of (P I) can be obtained from that
of (P II) by integration, differentiation and algebraic operations.

The product of two operators of form (3.2), when applied to functions of
H, is again an operator of the same type, since $Y^2$ can be replaced by $-D^2$.
We call the highest positive order of differentiation in an operator (3.2) the
order of the operator. For an operator of form (3.1), we write $L^*$ for $L(D, -Y)$,
so that $L^*L$ involves only x-differentiation on functions of H.
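
To see why $L^*L$ contains no $y$-derivatives, it may help to group (3.1) as $L = A(D)\,Y + B(D)$. The polynomials $A$ and $B$ are shorthand introduced here for illustration and are not the author's notation. A short derivation under that grouping:

```latex
% A(D) = \sum_m a_m D^m and B(D) = \sum_m b_m D^m are illustrative shorthand.
\begin{aligned}
L      &= A(D)\,Y + B(D), \qquad L^{*} = L(D,-Y) = -A(D)\,Y + B(D),\\
L^{*}L &= B(D)^{2} - A(D)^{2}\,Y^{2}
        = B(D)^{2} + A(D)^{2}\,D^{2}
        \quad\text{on functions of } H \ (\text{where } Y^{2} = -D^{2}).
\end{aligned}
```

The cross terms cancel because constant-coefficient operators commute. If $L$ has order $k$, the result is a polynomial in $D$ of degree at most $2k$; for $L = \alpha D + \beta Y$ it reduces to $(\alpha^2+\beta^2)D^2$, the expression used in the oblique-derivative example of § 5.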

Consider the special class of problems (P II),

(II) $Yw = 0$ on $y = 0$, $|x| > a$; $Mw = h(x)$ on $y = 0$, $|x| < a$.

Let F denote the class of functions $f(x)$, continuous and satisfying a Hölder
condition on $|x| < a$, with $(a^2 - x^2)^{\alpha} f(x)$ continuous at $x = \pm a$ for some
$\alpha < 1$. Define the linear transformation $W(x, y; f)$ over F by

\[ W(x, y; f) = -(\pi)^{-1} \int_{-a}^{+a} f(t)\, \log[(x - t)^2 + y^2]^{1/2}\, dt. \tag{3.3} \]


Now if $w(x, y) = W(x, y; f)$, $f \in F$, it is clear that $w$ is harmonic in $y < 0$,
continuous on $y \le 0$, and satisfies $Yw = 0$ on $y = 0$, $|x| > a$. We can accordingly
make $w(x, y)$ a solution of (P II) by choosing an $f \in F$ such that

\[ MW(x, 0; f) = h(x), \qquad |x| < a. \tag{3.4} \]


If the solution, $w$, of (P II) has the representation (3.3), with an $f \in F$
satisfying (3.4), we call (P II) representable. We remark that if instead of
Laplace's equation we had started with the equation

\[ u_{xx} + u_{yy} - k^2 u = 0, \]

all bounded solutions of (P II) would vanish exponentially as $x^2 + y^2 \to \infty$
and their representability would be an immediate consequence of Green's
theorem (see § 5). For Laplace's equation representability is complicated by
the presence of the logarithmic term.


THEOREM 2. Suppose problem (P I) (as above) is unique with order $L = k$,
and that problem (P II):

(II) $Yv = 0$ on $y = 0$, $|x| > a$; $D^{-2k}L^*MYv = h(x)$ on $y = 0$, $|x| < a$,

is representable. Then (P II) solves (P I).


4. Verification. Suppose (P II) can be solved. Let $v_i(x, y)$, $i = 0, 1, \ldots,
2k$, be solutions of (P II) for

\[ h(x) = 1, x, \ldots, x^{2k-1}, \qquad h(x) = D^{-2k}L^*L\, g(x), \]

respectively. Set

\[ v(x, y) = \sum_{i=0}^{2k-1} A_i v_i + v_{2k}, \tag{4.1} \]

with constants $A_i$ to be determined. Thus

\[ D^{-2k}ML^*Yv = \sum_{i=0}^{2k-1} A_i x^i + D^{-2k}L^*L\, g(x) \quad\text{on } y = 0,\ |x| < a. \tag{4.2} \]


Now suppose we can find $u(x, y)$ such that

\[ Lu = Yv \quad\text{in } y < 0 \tag{4.3} \]

with $u$ harmonic in $y < 0$. Then we would have $Lu = 0$ on $y = 0$, $|x| > a$.
Also

\[ L^*LMu = L^*MYv = D^{2k}(D^{-2k}L^*MYv). \tag{4.4} \]

The relation (4.4) holds first in $y < 0$. However, with $g(x)$ and hence the
right side of (4.2) analytic for $|x| < a$, one sees that $v(x, y)$ can be continued
across $y = 0$, $|x| < a$ as a harmonic function. Thus by (4.3) the harmonic
function $u(x, y)$ can be continued across $y = 0$, $|x| < a$, so that (4.4) continues
to hold for $y = 0$, $|x| < a$, and by (4.2),

\[ L^*L(Mu - g) = 0 \quad\text{on } y = 0,\ |x| < a. \tag{4.5} \]


Now (4.5) is an ordinary differential equation for $Mu - g$, with constant
coefficients and of order $2k$ in $x$, on $y = 0$, $|x| < a$. If we require the vanishing
of $Mu - g$ and its first $2k - 1$ derivatives at the one point $x = 0$, then we
guarantee that $Mu - g = 0$ on $y = 0$, $|x| < a$, that is, that $u$ is a solution
of (P I). Define $u_i(x, y)$ ($i = 0, 1, \ldots, 2k$) as harmonic functions satisfying

\[ Lu_i = Yv_i \quad\text{in } y < 0. \]

Then the solution of (4.3) can be written

\[ u(x, y) = \sum_{i=0}^{2k-1} A_i u_i + u_{2k}, \tag{4.6} \]

and the conditions $(Mu - g)^{(m)} = 0$, $m = 0, 1, \ldots, 2k - 1$, at $x = 0$, which
guarantee that $Mu = g$, become

\[ \sum_{i=0}^{2k-1} A_i (Mu_i)^{(m)} = g^{(m)} - (Mu_{2k})^{(m)} \quad\text{at } y = 0,\ x = 0. \tag{4.7} \]


This is a system of linear equations for the determination of $A_i$, $i = 0, 1, \ldots,
2k - 1$. It has a solution unless the homogeneous system,

\[ \sum_{i=0}^{2k-1} A_i (Mu_i)^{(m)} = 0 \quad\text{at } y = 0,\ x = 0, \tag{4.8} \]

has a non-trivial solution. Suppose (4.8) has a solution $A_i^0$ and set

\[ u^0 = \sum_{i=0}^{2k-1} A_i^0 u_i. \]

Then by (4.4) and the definition of the $v_i$,

\[ L^*LMu^0 = D^{2k} D^{-2k}L^*MY \sum_{i=0}^{2k-1} A_i^0 v_i = D^{2k} \sum_{i=0}^{2k-1} A_i^0 x^i = 0 \quad\text{on } y = 0,\ |x| < a. \]

But now (4.8) implies this ordinary differential equation has the solution
$Mu^0 = 0$ on $y = 0$, $|x| < a$. Accordingly $u^0$ is a solution of (P I) with $g(x) = 0$,
and since (P I) is unique we infer $u^0 = 0$ in $y < 0$. Now we apply (4.3) and
learn that

\[ Lu^0 = 0 = Y\Bigl(\sum_{i=0}^{2k-1} A_i^0 v_i\Bigr) \quad\text{in } y < 0, \]

and therefore,

\[ D^{-2k}L^*MY\Bigl(\sum_{i=0}^{2k-1} A_i^0 v_i\Bigr) = 0 = \sum_{i=0}^{2k-1} A_i^0 x^i, \]

so that $A_i^0 = 0$. Thus we conclude (4.7) always has a solution.
We must now indicate the construction (4.3). Since (P II) is representable,
we can write (4.3) in the form

\[ Lu = YW(x, y; f) \tag{4.9} \]

for some $f \in F$. Now set

\[ G(x, y, t) = \int_{P_1} [L(iw, -w)]^{-1} \exp[-wy + iw(x - t)]\, dw + \int_{P_2} [L(iw, w)]^{-1} \exp[wy + iw(x - t)]\, dw, \tag{4.10} \]

where $P_1(-\infty, 0)$ and $P_2(0, \infty)$ are any paths in the w-plane avoiding the
poles of the integrand and ultimately running along the real axis. $G(x, y, t)$
is clearly harmonic in $y < 0$ and

\[ LG(x, y, t) = \int_{P_1} \exp[-wy + iw(x - t)]\, dw + \int_{P_2} \exp[wy + iw(x - t)]\, dw = 2y[y^2 + (x - t)^2]^{-1} = 2 \frac{\partial}{\partial y} \log\{(x - t)^2 + y^2\}^{1/2}. \]

Thus if we define another transformation, $U(x, y; f)$, over F by

\[ U(x, y; f) = -(2\pi)^{-1} \int_{-a}^{+a} f(t)\, G(x, y, t)\, dt, \tag{4.11} \]

we have

\[ LU = YW \]

and (4.9) is solved by setting $u(x, y) = U(x, y; f)$.
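
The reason $G$ is harmonic, and the reason applying $L$ strips the reciprocal factors from the integrands of (4.10), can be checked on a single exponential. The following is a sketch; the symbol $E_{\mp}$ is shorthand introduced here, not the author's notation:

```latex
% E_{\mp} is illustrative shorthand for the two exponentials appearing in (4.10).
\begin{aligned}
E_{\mp}(x,y) &= \exp[\mp wy + iw(x-t)],\\
D\,E_{\mp} &= iw\,E_{\mp}, \qquad Y\,E_{\mp} = \mp w\,E_{\mp},\\
(D^{2}+Y^{2})\,E_{\mp} &= \bigl((iw)^{2} + w^{2}\bigr)E_{\mp} = 0,\\
L(D,Y)\,E_{\mp} &= L(iw,\mp w)\,E_{\mp}.
\end{aligned}
```

Dividing each integrand by $L(iw, \mp w)$ therefore leaves the bare exponentials once $L$ is applied, provided differentiation under the integral sign is justified along the chosen paths.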

The paths $P_1$ and $P_2$ for a particular problem (P I) are dictated by the
desired behaviour as $x^2 + y^2 \to \infty$. In this connection the poles of $[L(iw, \pm w)]^{-1}$
are of some heuristic value in guessing what this behaviour should be.

Finally, we wish to make some remarks concerning the derivatives of $Mu$,
or now of $MU(x, y; f)$, at $x = 0$, $y = 0$. The integrands in (4.10) involve
reciprocals of polynomials of degree $k$ (or $k + 1$) in $w$. Hence $k - 2$ (or
$k - 1$) differentiations can be carried out directly, leaving absolutely convergent
integrals for $y = 0$. It is not very difficult to see that $k - 1$ (or $k$)
differentiations lead to a function which becomes infinite like $\log |x - t|$ for
$y = 0$, $x \to t$. Further differentiations require some care. From (4.10) it is
clear that $k - 1$ (or $k$) differentiations of $G$ lead to terms of the form

\[ C\Bigl\{ \int_{P_1} \exp[-wy + iw(x - t)]\, dw \pm \int_{P_2} \exp[wy + iw(x - t)]\, dw \Bigr\} + \ldots, \]

the dots indicating terms continuous or at most logarithmically infinite for
$y = 0$, $x = t$. The first terms here are

\[ C' \frac{\partial}{\partial y} \log[(x - t)^2 + y^2]^{1/2} \quad\text{and}\quad C'' \frac{\partial}{\partial x} \log[(x - t)^2 + y^2]^{1/2}, \]

corresponding to $+$ and $-$ respectively. Thus $k - 1$ (or $k$) differentiations of
$U(x, y; f)$ yield $DW(x, y; f)$ or $YW(x, y; f)$. From (3.3) we see that $YW(x, 0; f)
= f(x)$ on $|x| < a$; hence further derivatives of $U(x, y; f)$ may be computed
from derivatives of the function $f(x)$ or of $W(x, 0; f)$. In particular the functions
$v_i(x, y)$ are analytic on $y = 0$, $|x| < a$. The associated functions $f_i(x)$, with
$v_i(x, y) = W(x, y; f_i)$, are in turn given by $f_i(x) = Yv_i(x, 0)$ on $|x| < a$, hence
are also analytic in $|x| < a$ and have derivatives of all orders at $x = 0$.
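
The boundary relation $YW(x, 0; f) = f(x)$ on $|x| < a$ used above can be checked numerically. The sketch below assumes the normalisation $W = -(\pi)^{-1}\int_{-a}^{a} f(t)\log[(x-t)^2+y^2]^{1/2}\,dt$ of (3.3), differentiates under the integral sign, and evaluates the result just below the axis with a midpoint rule. The function name `yw`, the sample densities, and the step count are illustrative choices, not part of the paper.

```python
import math


def yw(f, x, y, a=1.0, n=20000):
    """Midpoint-rule value of Y W(x, y; f) for y < 0.

    Differentiating (3.3) under the integral sign gives
        Y W = -(1/pi) * int_{-a}^{a} f(t) * y / ((x - t)^2 + y^2) dt,
    which should approach f(x) as y -> 0- for |x| < a.
    """
    step = 2.0 * a / n
    total = 0.0
    for j in range(n):
        t = -a + (j + 0.5) * step          # midpoint of the j-th panel
        total += f(t) * y / ((x - t) ** 2 + y * y)
    return -total * step / math.pi


if __name__ == "__main__":
    # Compare Y W just below the axis with the density itself.
    for x in (0.0, 0.3, -0.5):
        print(x, yw(math.cos, x, -0.01), math.cos(x))
```

The grid spacing must stay well below $|y|$ for the midpoint rule to resolve the peaked kernel; the residual difference from $f(x)$ is of order $|y|$ and shrinks as $y \to 0^-$.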


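5. Examples. Let (P I) be the problem of the "oblique derivative," that is,

(I) $(\alpha D + \beta Y)u = 0$ on $y = 0$, $|x| > a$; $(\gamma D + \delta Y)u = g(x)$ on $y = 0$, $|x| < a$.

Here we have $L^*L = (\beta^2 + \alpha^2)D^2$ and

\[ L^*MY = (\alpha\gamma + \beta\delta)D^2 Y + (\beta\gamma - \alpha\delta)D^3. \]

The "solving" problem (P II) has then the form

(II) $Yv = 0$ on $y = 0$, $|x| > a$; $[(\alpha\gamma + \beta\delta)Y + (\beta\gamma - \alpha\delta)D]v = h$ on $y = 0$, $|x| < a$.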
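
The operator identities quoted for the oblique-derivative problem can be checked directly from $L = \alpha D + \beta Y$, $M = \gamma D + \delta Y$ and the replacement $Y^2 = -D^2$ on harmonic functions. A short worked derivation (the intermediate steps are supplied here and are not in the original):

```latex
\begin{aligned}
L^{*} &= \alpha D - \beta Y,\\
L^{*}L &= \alpha^{2}D^{2} - \beta^{2}Y^{2} = (\alpha^{2}+\beta^{2})\,D^{2},\\
L^{*}M &= \alpha\gamma\,D^{2} + (\alpha\delta-\beta\gamma)\,DY - \beta\delta\,Y^{2}
        = (\alpha\gamma+\beta\delta)\,D^{2} + (\alpha\delta-\beta\gamma)\,DY,\\
L^{*}MY &= (\alpha\gamma+\beta\delta)\,D^{2}Y + (\alpha\delta-\beta\gamma)\,D\,Y^{2}
         = (\alpha\gamma+\beta\delta)\,D^{2}Y + (\beta\gamma-\alpha\delta)\,D^{3},\\
D^{-2}L^{*}MY &= (\alpha\gamma+\beta\delta)\,Y + (\beta\gamma-\alpha\delta)\,D.
\end{aligned}
```

Dividing out $D^2$ in the last line gives the boundary operator that appears in the second of conditions (II).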


An explicit solution of (P II) can be obtained which becomes logarithmically
infinite as $x^2 + y^2 \to \infty$. For setting $v(x, y) = W(x, y; f)$ we see that the first
of conditions (II) is satisfied while the second requires

\[ (\alpha\gamma + \beta\delta)\, YW(x, 0; f) + (\beta\gamma - \alpha\delta)\, DW(x, 0; f) = h(x) \quad\text{on } |x| < a, \]

or,

\[ (\alpha\gamma + \beta\delta)(\beta\gamma - \alpha\delta)^{-1} f(x) - (\pi)^{-1} \int_{-a}^{+a} \frac{f(t)}{x - t}\, dt = (\beta\gamma - \alpha\delta)^{-1} h(x) \quad\text{on } |x| < a, \tag{5.1} \]

if $\beta\gamma - \alpha\delta \ne 0$. It is easy to see that $\beta\gamma - \alpha\delta = 0$ implies the two derivatives
in (I) are in the same direction. Equation (5.1) was solved explicitly by
Carleman (2).

As a second example we consider the diffraction of a two-dimensional
progressive water wave by a rigid dock of finite width. For a physical
description see (4) or (3). The problem (P I) is

(I) $(Y - K)u = 0$ on $y = 0$, $|x| > a$; $Yu = g(x)$ on $y = 0$, $|x| < a$.











Here $L^*L = D^2 + K^2$, $L^*MY = D^2(Y + K)$, and hence the solving problem
(P II) is

(II) $Yv = 0$ on $y = 0$, $|x| > a$; $(Y + K)v = h(x)$ on $y = 0$, $|x| < a$.
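
For the dock problem, where $L = Y - K$ and $M = Y$, the same bookkeeping with $Y^2 = -D^2$ reproduces these operators. A brief worked check (steps supplied here, not in the original):

```latex
\begin{aligned}
L^{*} &= L(D,-Y) = -Y - K,\\
L^{*}L &= -(Y+K)(Y-K) = K^{2} - Y^{2} = D^{2} + K^{2},\\
L^{*}MY &= -(Y+K)\,Y^{2} = D^{2}(Y+K),\\
D^{-2}L^{*}MY &= Y + K.
\end{aligned}
```

The last line is the boundary operator acting on $v$ over the dock $|x| < a$.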
Again (P II) is solved by setting $v(x, y) = W(x, y; f)$ with

\[ (Y + K)W(x, 0; f) = h(x) \quad\text{on } |x| < a \]

or

\[ f(x) - K(\pi)^{-1} \int_{-a}^{+a} f(t) \log |x - t|\, dt = h(x). \tag{5.2} \]

Equation (5.2) is no longer explicitly solvable, but there exists a systematic
procedure for its numerical solution as indicated in (3). We remark that the
harmonic conjugate, $\bar v(x, y)$, of $v(x, y)$ satisfies by (II),

(II)' $\bar v = 0$ on $y = 0$, $|x| > a$; $(-D^2 + KY)\bar v = h'(x)$ on $y = 0$, $|x| < a$;

and this is the problem considered in (4).
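
As an illustration of how an equation of the form (5.2) might be attacked numerically, the sketch below uses piecewise-constant collocation with the logarithmic kernel integrated exactly over each panel. This is a generic discretisation chosen for illustration and is not claimed to be the procedure of reference (3); all function names and the panel count are hypothetical.

```python
import math


def F(s):
    """Antiderivative of log|s|: F(s) = s*log|s| - s, with F(0) = 0."""
    return 0.0 if s == 0.0 else s * math.log(abs(s)) - s


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= m * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def solve_dock_equation(h, K, a, n):
    """Collocation for f(x) - (K/pi) * int_{-a}^{a} f(t) log|x-t| dt = h(x).

    f is taken constant on n equal panels; the panel integral of log|x-t|
    over [t_j, t_{j+1}] equals F(x - t_j) - F(x - t_{j+1}).
    """
    edges = [-a + 2.0 * a * j / n for j in range(n + 1)]
    mids = [0.5 * (edges[j] + edges[j + 1]) for j in range(n)]
    A = [[(1.0 if i == j else 0.0)
          - (K / math.pi) * (F(x - edges[j]) - F(x - edges[j + 1]))
          for j in range(n)]
         for i, x in enumerate(mids)]
    return mids, solve_linear(A, [h(x) for x in mids])


def h_for_unit_density(x, K=1.0, a=1.0):
    """Right-hand side of (5.2) that corresponds to the density f = 1."""
    return 1.0 - (K / math.pi) * (F(x + a) + F(a - x))


if __name__ == "__main__":
    mids, f = solve_dock_equation(h_for_unit_density, K=1.0, a=1.0, n=40)
    print(max(abs(v - 1.0) for v in f))   # deviation from the known density
```

The manufactured right-hand side `h_for_unit_density` makes the scheme easy to validate, since a constant density is represented exactly by the piecewise-constant ansatz. Densities with the endpoint behaviour allowed in the class F would call for graded panels or a weighted basis.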
It is shown in (3) that (P I) is unique and that a solution exists. Although 


we shall not carry through the computation, these facts can be used to establish 
that (P II) is unique and representable. 


As a final example we wish to discuss the same water wave problem when
the incident wave strikes the dock at an angle. In this case the boundary
conditions for (P I) are the same, but the equation is

\[ u_{xx} + u_{yy} - k^2 u = 0. \tag{5.3} \]

A minor modification of Theorem 2 shows that (P I) is solved by problem
(P II), that is, find a solution of (5.3) such that

(II) $Yv = 0$ on $y = 0$, $|x| > a$; $(Y + K)v = h(x)$ on $y = 0$, $|x| < a$.


From the first of conditions (II), $v(x, y)$ may be continued across $y = 0$,
$|x| > a$ so as to be a single-valued solution of (5.3) in $x^2 + y^2 > a^2$. If it is
bounded, it follows that

\[ v(x, y) = O\bigl(\exp[-k(x^2 + y^2)^{1/2}]\bigr) \quad\text{as } x^2 + y^2 \to \infty. \tag{5.4} \]

Problem (P II) can then be reformulated as a positive definite variational
problem, namely:

Among all functions, $w(x, y)$, satisfying (5.4) and with

\[ F[w] = \iint_{y<0} [w_x^2 + w_y^2 + k^2 w^2]\, dx\, dy + \int_{-a}^{+a} (w - h)^2\, dx < \infty, \]

find that one minimizing $F[w]$.
The existence of a solution of (P II) under condition (5.4) is easily established.
For uniqueness follows immediately from Green's theorem and we can set

\[ v(x, y) = -(\pi)^{-1} \int_{-a}^{+a} f(t)\, K_0\bigl(k[(x - t)^2 + y^2]^{1/2}\bigr)\, dt, \]

where $K_0$ is the singular Bessel function serving as fundamental solution of
(5.3). This is representability for equation (5.3). Then (5.3) is satisfied, as
is (5.4) and the first of conditions (II). The second of (II) will also be satisfied
if $f$ is a solution of

\[ f(x) - K(\pi)^{-1} \int_{-a}^{+a} f(t)\, K_0(k|x - t|)\, dt = h(x) \quad\text{on } |x| < a. \]

Fredholm theory applies and the uniqueness theorem guarantees the homogeneous
equation has no non-trivial solution. It follows that the variational
problem also has a solution.

It is rather interesting that (P II) can be formulated as a variational
problem, for (P I) cannot be so formulated directly, being in reality an oscillation
problem with infinite energy. (P II) finds its natural physical prototype
in steady state heat flow, an equilibrium phenomenon with finite energy. This
fundamental difference in the physical character of (P I) and (P II) seems
to the author to enhance the interest of the fact that the one can be reduced
to the other.


REFERENCES 


1. B. Baker and E. T. Copson, Huygens' Principle (Oxford, 1950).

2. T. Carleman, Sur la résolution de certaines équations intégrales, Arkiv för Mat., Astr. och
Fys., 16 (1922), No. 26.

3. R. C. MacCamy, Linear boundary problems arising in the diffraction of water waves by
surface obstacles, Part II, Technical Report No. 2, Nonr-222(25) (University of
California at Berkeley, Nov. 1955).

4. H. Rubin, The dock of finite extent, Comm. Pure and App. Math., 7 (1954), 317-344.


Carnegie Institute of Technology 
and 
University of California, Berkeley 







LECTURES ON ORDINARY DIFFERENTIAL EQUATIONS 


By the late Witold Hurewicz, formerly of the Massachusetts Institute of Technology.
A Technology Press Book, M.I.T. 1958. 122 pages. $5.00.


INTRODUCTION TO DIFFERENCE EQUATIONS 


With Illustrative Examples from Economics, Psychology, and Sociology 
By Samuel Goldberg, Oberlin College. 1958. 260 pages. $6.75. 


SWITCHING CIRCUITS AND LOGICAL DESIGN 


By Samuel H. Caldwell, Massachusetts Institute of Technology. 1958. 686 pages. 
College Edition $11.75. 


SURVEYS IN APPLIED MATHEMATICS 


Vol. I: Elasticity and Plasticity. By J. N. Goodier, Stanford University; and P. G.
Hodge, Jr., Illinois Institute of Technology. 1958. 152 pages. $6.25.

Vol. II: Dynamics and Nonlinear Mechanics. By E. Leimanis, University of British
Columbia; and N. Minorsky, Stanford University. 1958. 206 pages. $7.75.

Vol. III: Mathematical Aspects of Subsonic and Transonic Gas Dynamics. By
Lipman Bers, New York University. 1958. 164 pages. $7.75.

Vol. IV: Some Aspects of Analysis and Probability. By I. Kaplansky, The University
of Chicago; E. Hewitt, University of Washington; M. Hall, Jr., Ohio State University;
and R. Fortet, Institut Henri Poincaré. 1958. 243 pages. $9.00.

Vol. V: Numerical Analysis and Partial Differential Equations. By G. E. Forsythe,
Stanford University; and P. C. Rosenbloom, University of Minnesota. 1958. Approx.
204 pages. $7.50.


SOME ASPECTS OF MULTIVARIATE ANALYSIS* 
By S. N. Roy, University of North Carolina. 1958. 214 pages. $8.00. 


AN INTRODUCTION TO COMBINATORIAL ANALYSIS* 
By John Riordan, Bell Telephone Laboratories. 1958. 244 pages. $8.50. 


INTRODUCING MATHEMATICS 
By Floyd F. Helton, Morrison Observatory, Central College. 1958. 396 pages. $5.75. 


FINITE QUEUING TABLES 


By L. G. Peck and R. N. Hazelwood; both of Arthur D. Little, Inc., Mass. 
Publications in Operations Research #2. 1958. 210 pages. $8.50. 


*One of the Wiley Publications in Statistics, Walter A. Shewhart and S. S. Wilks, 
Editors. 
Send for examination copies. 
University of Toronto Press, Toronto, Ontario 
and Renouf Publishing Company, Montreal, Quebec 








